CliexMap (CliexMap v0.2.0)
cliex_map is a Unix filter that applies a pattern to each input line.
It is an opinionated subset of sed, with many built-in features tailored to file manipulation (move, rename, format, timestamps) and to field and substring extraction à la awk, but much more concise and less powerful.
The design goal is a powerful mini-language that describes patterns, mostly in a single line. We shall see where this journey leads ;)
Let us document how these patterns are compiled and how the compiled form is rendered. Here is a primitive example that demonstrates this:
iex(1)> run(~w[1 2], "Hello")
["Hello", "Hello"]
Because a pattern always starts with %, we can render literal % characters by doubling them:
iex(2)> run(~w[alpha], "%% %%")
["% %"]
N.B. End-of-line characters are added if not present.
Patterns
The patterns shown above are literals; they are represented as binaries in the compiled AST.
Fields: % and %<integer>
The first pattern we describe here is also the most used. It corresponds to awk's $0, $1 and so forth, but there is a little more to it.
Instead of $ we use % (for simpler shell integration, inspired by xargs)
We can write % as a shortcut for %0, and we can write %-1 for awk's $NF, %-2 for $(NF-1), and so on
We can substring, format and modify %f<n> like all other patterns, as the corresponding sections will show
Here is a simple echo pattern with and without the explicit 0
iex(3)> run(~w[alpha beta], "%")
~w[alpha beta]
N.B. M is a shortcut for CliexMap.Modifiers, defined in the doctest for readability.
And here the (double) echo with an explicit 0
iex(4)> run(~w[alpha], "% %0")
["alpha alpha"]
If there is only one field, then %1 is, again, a synonym of %:
iex(5)> run(["alpha", "beta gamma"], "% %1")
["alpha alpha", "beta gamma beta"]
Now let us show how negative indices and indices of nonexistent fields are rendered:
iex(6)> sentence = "The quick brown fox jumps"
...(6)> run([sentence], "%-1 %-2 '%6'")
["jumps fox ''"]
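The lookup rules shown so far can be summarized in a minimal sketch. FieldSketch and its field/2 function are hypothetical names used for illustration only; this is not CliexMap's implementation, and it assumes that fields are separated by whitespace, as in awk.

```elixir
defmodule FieldSketch do
  # Hypothetical helper, not part of CliexMap: looks up one field of a line.
  # Index 0 stands for the whole line, like % or %0.
  def field(line, 0), do: line

  def field(line, index) do
    # Assumption: fields are separated by whitespace.
    fields = String.split(line)
    # Positive indices are 1-based, negative ones count from the end.
    position = if index > 0, do: index - 1, else: index
    # A field that does not exist is rendered as the empty string.
    Enum.at(fields, position, "")
  end
end

sentence = "The quick brown fox jumps"
IO.inspect(Enum.map([-1, -2, 6], &FieldSketch.field(sentence, &1)))
```

For the sentence above this yields "jumps", "fox" and an empty string, matching the rendering of %-1, %-2 and %6.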
Line Numbers: %n
This is rendered like awk's NR - 1, i.e. line numbers start at 0
iex(7)> run(["", "", "ignored"], "%n")
~w[0 1 2]
Timestamps: %ts, %tms, %tmics, %xs, %xms and %xmics
These fields are rendered as the Unix timestamp in seconds, milliseconds or microseconds (the %x variants in hexadecimal). The same timestamp is used for all input lines, thus mapping every line to a constant value.
iex(8)> run([""], "%ts", C.for_now(1691231907123456))
["1691231907"]
N.B. C is a shortcut for CliexMap.Context, defined in the doctest for readability. The context is generated when the pattern is compiled. As a consequence, %ts and friends are all rendered identically for each input line. Here are examples for the other 5 timestamp formats.
iex(9)> run([""], "%tms", C.for_now(1691231907123456))
["1691231907123"]
iex(10)> run(["", "", ""], "%tmics", C.for_now(1691231907123456))
["1691231907123456", "1691231907123456", "1691231907123456"]
iex(11)> run([""], "%xs", C.for_now(1691231907123456))
["64ce26a3"]
iex(12)> run([""], "%xms", C.for_now(1691231907123456))
["189c546ed33"]
iex(13)> run(["", "", ""], "%xmics", C.for_now(1691231907123456))
["6022a9d0e9100", "6022a9d0e9100", "6022a9d0e9100"]
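The relation between the six formats can be sketched in plain Elixir. TimestampSketch is a hypothetical module written for this illustration; it only assumes what the examples above show, namely that the context holds one timestamp in microseconds and that the %x variants are lowercase hexadecimal.

```elixir
defmodule TimestampSketch do
  # Hypothetical helper, not CliexMap's code: renders one timestamp, given in
  # microseconds, in the six formats described above.
  def render(micros) when is_integer(micros) do
    secs = div(micros, 1_000_000)
    millis = div(micros, 1_000)

    %{
      ts: Integer.to_string(secs),
      tms: Integer.to_string(millis),
      tmics: Integer.to_string(micros),
      xs: hex(secs),
      xms: hex(millis),
      xmics: hex(micros)
    }
  end

  # Integer.to_string/2 yields uppercase hex digits, so we downcase them.
  defp hex(n), do: n |> Integer.to_string(16) |> String.downcase()
end

IO.inspect(TimestampSketch.render(1_691_231_907_123_456))
```

Because the map is computed once from a single timestamp, every input line sees the same values, which is the behaviour the doctests demonstrate.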
Variables
Up to here, patterns have been simple, but we will see later that they can become quite complicated. Without explaining the details of the following example, we can see that we have to write a long pattern, with repeated parts, to create the desired command for each input line
iex(14)> run(["src/DIR/subdir/file.jsno"],
...(14)> ~s{mkdir -p bup/%xs/%(segments 1 -2)(downcase); cp % bup/%xs/%(segments 1 -2)(downcase)/%(segments -1)(sub ".jsno" ".json")},
...(14)> C.for_now(1691231907123456))
["mkdir -p bup/64ce26a3/dir/subdir; cp src/DIR/subdir/file.jsno bup/64ce26a3/dir/subdir/file.json"]
Actually the pattern above is a little more complicated than needed, but this is to demonstrate the use of variables.
Setting a variable with %S
%S<varname><pattern> will compile <pattern>, but instead of rendering its result it will store it in the context. Thus the following is useless:
iex(15)> run(["useless input"], "%Saa<%(split)(revlist)(join)>")
[""]
We can, however, load the variable with %L after its declaration, as many times as we like
iex(16)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx-%Lx-")
["-ba-ba-ba-"]
Of course, a loaded variable can be modified like any other field
iex(17)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx(upcase)")
["-ba-BA"]
Variables can also be used to store constant strings
iex(18)> run([""], "%Sworld<Brave new world>%Lworld %Lworld(split)(at 1)")
["Brave new world new"]
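The store and load behaviour can be modelled in a few lines. The following is only a sketch under assumptions: VarSketch, the tuple shapes {:store, name, value} and {:load, name}, and the empty string for an unknown variable are all invented for this illustration and say nothing about CliexMap's real AST or error handling.

```elixir
defmodule VarSketch do
  # Hypothetical model: a compiled pattern is a list of parts, each either a
  # literal binary, {:store, name, value} or {:load, name}.
  def render(parts) do
    {chunks, _vars} =
      Enum.map_reduce(parts, %{}, fn
        # Storing renders nothing and remembers the value.
        {:store, name, value}, vars -> {"", Map.put(vars, name, value)}
        # Loading renders the stored value (assumption: "" if unknown).
        {:load, name}, vars -> {Map.get(vars, name, ""), vars}
        # Literals are rendered as they are.
        literal, vars when is_binary(literal) -> {literal, vars}
      end)

    Enum.join(chunks)
  end
end

IO.inspect(VarSketch.render([{:store, "x", "ba"}, "-", {:load, "x"}, "-", {:load, "x"}, "-"]))
```

Threading the variables through Enum.map_reduce/3 keeps the sketch free of mutable state: a store only affects the parts that come after it.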
Pattern Modifiers
All patterns can be modified by a small, Lisp-like mini-language attached directly to the pattern. This LLLML is expressed as s-expressions with predefined built-in functions.
The value of the field is injected into the first such s-expression, the result of which is injected into the second s-expression, and so forth. Here is an example to demonstrate the principle:
iex(19)> run(["alpha"], "%(reverse)")
["ahpla"]
Below we list all available built-in functions. Here we also show the more common use cases, such as splitting, extracting and combining segments of strings and formatting output, as well as the shortcut syntax that allows us to write commonly used modifiers in an even shorter form than LLLML.
Counting Modifiers and their shortcuts
Let us imagine that we want to precede each line with ten times its line number, starting at 10. We can of course use LLLML as follows:
iex(20)> run(["a", "a", "a", "a"], "%n(+ 1)(* 10) %")
["10 a", "20 a", "30 a", "40 a"]
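There are cases where we need the injected line number to be the second argument of a function, e.g. if we want to count down. We can indicate the position where a value shall be injected with the placeholder _ (see Parameter Positions). A countdown in steps of ten can also be written without it, by multiplying by a negative step: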
iex(21)> run(["b", "b", "b"], "%n(* -10)(+ 110) %")
["110 b", "100 b", "90 b"]
This is obviously a handful, and therefore we provide a shortcut with a simple value notation
iex(22)> run(["b", "b", "b"], "%n:110,-10: %")
["110 b", "100 b", "90 b"]
And the one value version can be used to change the start count, 1 being a popular choice ;)
iex(23)> run(["b", "b", "e"], "%n:1: %")
["1 b", "2 b", "3 e"]
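Judging from the two examples above, the shortcut appears to compute start + n * step for the zero-based line number n, with the step defaulting to 1. The sketch below states that reading explicitly; CountSketch is a hypothetical name and the formula is inferred from the doctests, not taken from the implementation.

```elixir
defmodule CountSketch do
  # Hypothetical reading of %n:start,step: where n is the zero-based line
  # number. With one value only, the step is assumed to be 1.
  def count(n, start, step \\ 1), do: start + n * step
end

# %n:110,-10: for three input lines
IO.inspect(Enum.map(0..2, &CountSketch.count(&1, 110, -10)))
# %n:1: for three input lines
IO.inspect(Enum.map(0..2, &CountSketch.count(&1, 1)))
```

The first call gives 110, 100, 90 and the second 1, 2, 3, in line with the doctests.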
Parameter Positions
Often we use the result of a field as the first argument of a modifier. Sometimes, however, we need to inject it at a different position.
We can use the placeholder _ for this.
We can demonstrate this with an explicit implementation of counting down line numbers (which is of course better done with the shortcuts shown above):
iex(24)> run(["", "", ""], "%n(- 10 _)")
["10", "9", "8"]
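The injection rule can be sketched as a fold over the modifiers. InjectSketch, the {function, args} tuples and the use of the atom :_ for the placeholder are assumptions made for this illustration; the real compiled form of a pattern may look different.

```elixir
defmodule InjectSketch do
  # Hypothetical model: a modifier is {function, args}, and :_ marks where
  # the incoming value goes. Without :_ the value becomes the first argument.
  def apply_modifiers(value, modifiers) do
    Enum.reduce(modifiers, value, fn {fun, args}, acc ->
      fun.(inject(acc, args))
    end)
  end

  defp inject(value, args) do
    if :_ in args do
      Enum.map(args, fn
        :_ -> value
        arg -> arg
      end)
    else
      [value | args]
    end
  end
end

# Stand-ins for the +, * and - builtins; each takes its argument list.
add = fn args -> Enum.sum(args) end
mul = fn args -> Enum.reduce(args, 1, &(&1 * &2)) end
sub = fn [first | rest] -> Enum.reduce(rest, first, fn x, acc -> acc - x end) end

# %n(+ 1)(* 10) for line numbers 0..3
IO.inspect(Enum.map(0..3, &InjectSketch.apply_modifiers(&1, [{add, [1]}, {mul, [10]}])))
# %n(- 10 _) for line numbers 0..2
IO.inspect(Enum.map(0..2, &InjectSketch.apply_modifiers(&1, [{sub, [10, :_]}])))
```

The first call reproduces 10, 20, 30, 40 and the second 10, 9, 8: each modifier receives the result of the previous one, either in front of its arguments or at the position of the placeholder.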
Cookbooks
File Trees
Duplicate a directory structure
We have a structure like this
src
|
+ namespace_1
|   |
|   +---- file1.json, ...., filen.json
|
+ namespace_2
|   |
|   +---- file1.json, ...., filen.json
|
+ namespace_3
    |
    +---- file1.json, ...., filen.json
and want to create a structure like this
tests
|
json_tests
|
+ namespace_1
|   |
|   +---- file1_test.exs, ...., filen_test.exs
|
+ namespace_2
|   |
|   +---- file1_test.exs, ...., filen_test.exs
|
+ namespace_3
    |
    +---- file1_test.exs, ...., filen_test.exs
Using the output of, e.g., git ls-files src or find src -name '*.json', we can map the input with the following pattern:
iex(25)> input = ~W[ src/namespace_1/file1.json src/namespace_2/file1.json src/namespace_2/file2.json ]
...(25)> run(input, "mkdir -p tests/json_tests/%(segment 1); touch tests/json_tests/%(segments 1 2)(sub '.json' '_test.exs')")
[
"mkdir -p tests/json_tests/namespace_1; touch tests/json_tests/namespace_1/file1_test.exs",
"mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file1_test.exs",
"mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file2_test.exs"
]
Comprehensive List of Builtin Functions
We use B as a shortcut for CliexMap.Builtins. We also pass in a CliexMap.Context (aliased to C), but only if needed.
abs
iex(26)> B.abs(nil, [-10])
10
add (used for +)
iex(27)> B._add(nil, [1, 2, 3, 4])
10
at (alias to Enum.at)
iex(28)> B.at(nil, [~w[a b c], 2])
"c"
iex(29)> B.at(nil, [~w[a b c], -3])
"a"
iex(30)> B.at(nil, [~w[a b c], 1, :default])
"b"
iex(31)> B.at(nil, [~w[a b c], 3, :default])
:default
div (used for /)
iex(32)> B._div(nil, [9, 4])
2.25
downcase
iex(33)> B.downcase(nil, ["HELLO"])
"hello"
idiv (used for :)
iex(34)> B._idiv(nil, [9, 4])
2
join
iex(35)> B.join(nil, [[1, 2]])
"12"
iex(36)> B.join(nil, [[1, 2, 3], " "])
"1 2 3"
mul
iex(37)> B._mul(nil, [1, 2, 3, 4])
24
reverse
iex(38)> B.reverse(nil, ["alpha"])
"ahpla"
revlist
iex(39)> B.revlist(nil, [~W[a b]])
~W[b a]
segment
Extract a segment of a path
with -1 like basename
iex(40)> B.segment(nil, ["a/b/c", -1])
"c"
without an arg like dirname
iex(41)> B.segment(nil, ["a/b/c"])
"a/b"
or any part
iex(42)> B.segment(nil, ["a/b/c", 0])
"a"
iex(43)> B.segment(nil, ["a/b/c", -2])
"b"
segments (short for (splicej "/" _1 _2))
iex(44)> B.segments(nil, ["x/y/z", 2])
"z"
iex(45)> B.segments(nil, ["x/y/z", -2, -1])
"y/z"
slice
iex(46)> B.slice(nil, [~w[a b c d], 1])
~w[b c d]
iex(47)> B.slice(nil, [~w[a b c d], 2, 3])
~w[c d]
iex(48)> B.slice(nil, [~w[a b c d], -2])
~w[c d]
iex(49)> B.slice(nil, [~w[a b c d], -4, -2])
~w[a b c]
iex(50)> B.slice(nil, [~w[a b c d], -4, 1])
~w[a b]
iex(51)> B.slice(nil, [~w[a b c d], 1, 0])
[]
splicej (or splice_join)
Split, splice and join
join on original separator
iex(52)> B.splicej(nil, ["a/b/c/d/e", "/", 2, 3])
"c/d"
join on original separator, slice to the end
iex(53)> B.splicej(nil, ["a/b/c/d/e", "/", 2])
"c/d/e"
join with different string
iex(54)> B.splice_join(nil, ["a/b/c/d/e", "/", 2, 3, ","])
"c,d"
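The three steps behind splicej can be written out with standard library calls. This is a minimal sketch, assuming that both indices are inclusive and that a missing end index means "to the end", as the examples suggest; SpliceSketch is a hypothetical module and not the code behind CliexMap.Builtins.

```elixir
defmodule SpliceSketch do
  # Hypothetical re-implementation: split on sep, keep the elements from
  # first to last (both inclusive, negative values count from the end),
  # then join with joiner, or with sep if no joiner is given.
  def splice_join(string, sep, first, last \\ -1, joiner \\ nil) do
    string
    |> String.split(sep)
    # The explicit step //1 yields an empty list when first is beyond last.
    |> Enum.slice(first..last//1)
    |> Enum.join(joiner || sep)
  end
end

IO.inspect(SpliceSketch.splice_join("a/b/c/d/e", "/", 2, 3))
IO.inspect(SpliceSketch.splice_join("a/b/c/d/e", "/", 2))
IO.inspect(SpliceSketch.splice_join("a/b/c/d/e", "/", 2, 3, ","))
# segments behaves like the same call with "/" as separator
IO.inspect(SpliceSketch.splice_join("x/y/z", "/", -2, -1))
```

The calls print "c/d", "c/d/e", "c,d" and "y/z", the same results as the splicej and segments doctests.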
split
iex(55)> B.split(nil, ["a b"])
["a", "b"]
iex(56)> B.split(nil, ["a,b", ","])
["a", "b"]
sub (from -) (use with care)
iex(57)> B._sub(nil, [1, 2, 3, 4])
-8
sub (from sub)
iex(58)> B.sub(nil, ["Hello World", "l"])
"Heo Word"
iex(59)> B.sub(nil, ["Hello World", "l", "L"])
"HeLLo WorLd"
upcase
iex(60)> B.upcase(nil, ["hello"])
"HELLO"