CliexMap (CliexMap v0.2.7-pre3)
`cliex_map` is a Unix filter that applies a pattern to each input line.
It is an opinionated subset of `sed` with many built-in features tailored for file manipulation (move, rename, format, timestamps) and for field and substring extraction à la `awk`, but much more concise and less powerful.
For more help you can use the following commands:
help pattern
help builtin
The design goal is to have a powerful minilanguage that describes patterns, mostly in one line. We shall see where this journey leads ;)

Let us document how these patterns are compiled, and how the compiled form is rendered. Here is a primitive example which demonstrates this:
iex(1)> run(~w[1 2], "Hello")
["Hello", "Hello"]
As a pattern always starts with `%`, we can render literal `%`s by doubling them
iex(2)> run(~w[alpha], "%% %%")
["% %"]
N.B. End-of-line characters are added if not present.
## Patterns
The patterns shown above are literals and are represented as binaries in the compiled AST.
### Fields: % and %<integer>
The first pattern we describe here is also the most used. It corresponds to awk's `$0`, `$1` and so forth, but there is a little more to it.
- Instead of `$` we use `%` (for simpler shell integration, inspired by xargs)
- We can write `%` as a shortcut for `%0`, and we can write `%-1` for awk's `$NF`, `%-2` for `$(NF-1)` and so on
- We can substring, format and modify `%f<n>` as all other patterns, which will be shown in the corresponding sections
Here is a simple echo pattern with and without the explicit 0
iex(3)> run(~w[alpha beta], "%")
~w[alpha beta]
N.B. that `M` is a shortcut for `CliexMap.Modifiers` defined in the doctest for readability
And here the (double) echo with an explicit 0
iex(4)> run(~w[alpha], "% %0")
["alpha alpha"]
If there is only one field then `%1` is, again, a synonym of `%`
iex(5)> run(["alpha", "beta gamma"], "% %1")
["alpha alpha", "beta gamma beta"]
Now let us show how negative indices and indices of non existing fields are rendered
iex(6)> sentence = "The quick brown fox jumps"
...(6)> run([sentence], "%-1 %-2 '%6'")
["jumps fox ''"]
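The indexing rules above can be modelled in a few lines of plain Elixir. This is only a sketch of the documented behaviour, not CliexMap's implementation; the `field` function is a hypothetical helper, and the empty-string result for missing fields is inferred from the doctest above.

```elixir
line = "The quick brown fox jumps"
fields = String.split(line)

# Hypothetical lookup mirroring %, %<n> and %-<n>:
# 0 is the whole line, positive indices are 1-based,
# negative indices count from the end, missing fields render as "".
field = fn
  0 -> line
  n when n > 0 -> Enum.at(fields, n - 1, "")
  n -> Enum.at(fields, n, "")
end

rendered = "#{field.(-1)} #{field.(-2)} '#{field.(6)}'"
IO.puts(rendered)
```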
### Line Numbers: %n
This is rendered like awk's `NR - 1`, i.e. as a zero-based line number
iex(7)> run(["", "", "ignored"], "%n")
~w[0 1 2]
### Timestamps: %ts, %tms, %tmics, %xs, %xms and %xmics
These fields are rendered with the Unix timestamp in seconds, milliseconds or microseconds (the `%x…` variants render it in hexadecimal), with the same timestamp for all input lines, thus mapping every line to a constant value.
iex(8)> run([""], "%ts", C.for_now(1691231907123456))
["1691231907"]
N.B. that `C` is a shortcut for `CliexMap.Context` defined in the doctest for readability, and that the context is generated when the pattern is compiled. As a consequence `%ts` and friends are all rendered identically for each input line. Here are examples for the other 5 timestamp formats.
iex(9)> run([""], "%tms", C.for_now(1691231907123456))
["1691231907123"]
iex(10)> run(["", "", ""], "%tmics", C.for_now(1691231907123456))
["1691231907123456", "1691231907123456", "1691231907123456"]
iex(11)> run([""], "%xs", C.for_now(1691231907123456))
["64ce26a3"]
iex(12)> run([""], "%xms", C.for_now(1691231907123456))
["189c546ed33"]
iex(13)> run(["", "", ""], "%xmics", C.for_now(1691231907123456))
["6022a9d0e9100", "6022a9d0e9100", "6022a9d0e9100"]
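The relation between the six formats can be checked with the standard library alone. The following is a sketch under the assumption that the context holds one timestamp in microseconds (as passed to `C.for_now` above); the `hex` helper and the `renderings` map are illustrative names, not part of CliexMap.

```elixir
# Assumed input: a Unix timestamp in microseconds, as in the doctests above.
micros = 1_691_231_907_123_456

# Hypothetical helper: lowercase hexadecimal rendering of an integer.
hex = fn int -> int |> Integer.to_string(16) |> String.downcase() end

renderings = %{
  ts: Integer.to_string(div(micros, 1_000_000)),
  tms: Integer.to_string(div(micros, 1_000)),
  tmics: Integer.to_string(micros),
  xs: hex.(div(micros, 1_000_000)),
  xms: hex.(div(micros, 1_000)),
  xmics: hex.(micros)
}

IO.inspect(renderings)
```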
### Conditional Patterns
These are patterns that filter out lines from the input if they return a falsy value. They also short-circuit, so you can chain them as in an `&&` expression.
#### Comparing Numerical Values (autoconversion)
iex(14)> run(["1", "2", "1"], "%(ifge 2)")
[""]
But if you need the value you have to repeat it
iex(15)> run(["1", "2"], "%(ifgt 1)%(to_i)(+ 1)")
["3"]
And of course we have all these variations
iex(16)> run(["1", "2"], "%(iflt 2)%(to_i)(+ 1)")
["2"]
iex(17)> run(["1", "2"], "%(ifle 2)%(to_i)(+ 1)")
["2", "3"]
iex(18)> run(["1", "2"], "%(ifge 3)")
[]
iex(19)> run(["a 0", "b 1"], "%2(ifeq 0)%1")
["a"]
iex(20)> run(["a 0", "b 1"], "%2(ifne 0)%1")
["b"]
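Conceptually, a conditional pattern turns the mapping into a combined filter and map. As a rough model (plain Elixir, not the library's code), the pattern `%(ifgt 1)%(to_i)(+ 1)` from above behaves like the following `Enum.flat_map/2`, where a failing condition contributes no output line at all.

```elixir
lines = ["1", "2"]

rendered =
  Enum.flat_map(lines, fn line ->
    value = String.to_integer(line)

    # A falsy condition drops the line; otherwise the rest of the pattern renders.
    if value > 1, do: [Integer.to_string(value + 1)], else: []
  end)

IO.inspect(rendered)
```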
However, the `rgx` function also produces output.
#### Regex Pattern Matches, filter out if no match
This builtin will also filter out lines that are not matched unless a default value is provided. In its simplest form the whole match of the field is returned
iex(21)> run(["a", "12"], ~s{%(rgx "[[:digit:]]+")})
["12"]
Or we can just get a capture group
iex(22)> run(["a", "12"], ~s{%(rgx "(.)([[:digit:]])" 2)})
["2"]
If we want to use the `rgx` function as a pure filter we can simply specify a capture group that does not exist
iex(23)> run(["a", "12"], ~s{%(rgx "(.)([[:digit:]])" 3)-> %})
["-> 12"]
And we can provide a default value
iex(24)> run(["a", "12"], ~s{%(rgx "(.)([[:digit:]])" 2 "no digit found")})
["no digit found", "2"]
Which we can also do if we want to match the whole regex
iex(25)> run(["a", "12"], ~s{%(rgx "[[:digit:]]+" oh_no)})
["oh_no", "12"]
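The filter-or-default behaviour can be approximated with `Regex.run/2`. This is a sketch only: `RgxSketch.rgx/4` is a hypothetical stand-in with fixed argument positions, whereas the real builtin also accepts the default directly after the regex, as the last doctest shows.

```elixir
defmodule RgxSketch do
  # Hypothetical model of the documented behaviour.
  # Returns {:ok, text} to keep the line, or :skip to filter it out.
  def rgx(field, regex, group \\ 0, default \\ nil) do
    case Regex.run(regex, field) do
      # No match and no default: the line is dropped.
      nil when is_nil(default) -> :skip
      # No match but a default: the default is rendered.
      nil -> {:ok, default}
      # Match: render the requested group, "" if it does not exist.
      captures -> {:ok, Enum.at(captures, group, "")}
    end
  end
end

IO.inspect(RgxSketch.rgx("12", ~r/(.)([[:digit:]])/, 2))
```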
### Variables
Up to here, patterns have been simple, but we will see later that they can become quite complicated. Without explaining the details of the following example, we can see that we need to write a long, repetitive pattern to create the desired command for each input line
iex(26)> run(["src/DIR/subdir/file.jsno"],
...(26)> ~s{mkdir -p bup/%xs/%(segments 1 -2)(downcase); cp % bup/%xs/%(segments 1 -2)(downcase)/%(segments -1)(sub ".jsno" ".json")},
...(26)> C.for_now(1691231907123456))
["mkdir -p bup/64ce26a3/dir/subdir; cp src/DIR/subdir/file.jsno bup/64ce26a3/dir/subdir/file.json"]
Actually the pattern above is a little more complicated than needed, but this serves to demonstrate the usage of variables
#### Setting a variable with %S
`%S<varname><pattern>` compiles `<pattern>` but, instead of rendering its result, stores it in the context. Thus the following is useless
iex(27)> run(["useless input"], "%Saa<%(split)(revlist)(join)>")
[""]
We can, however, load the variable with `%L` after its declaration, as many times as we like
iex(28)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx-%Lx-")
["-ba-ba-ba-"]
Of course a loaded variable can be modified like every other field
iex(29)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx(upcase)")
["-ba-BA"]
Variables can also be used to store constant strings
iex(30)> run([""], "%Sworld<Brave new world>%Lworld %Lworld(split)(at 1)")
["Brave new world new"]
## Pattern Modifiers
All patterns can be modified by a little, Lisp-like minilanguage (LLLML) directly attached to the pattern. LLLML is expressed in the form of s-expressions with built-in predefined functions.
The value of the field is injected into the first such s-expression, the result of which is injected into the second s-expression, and so forth. Here is an example to demonstrate the principle
iex(31)> run(["alpha"], "%(reverse)")
["ahpla"]
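The threading of a value through successive s-expressions resembles a fold over a list of functions. As an illustration only (standard library functions stand in for the builtins `downcase`, `split`, `revlist` and `join`), the modifier chain used in the variable examples above can be modelled like this:

```elixir
# Stand-ins for (downcase)(split)(revlist)(join); not CliexMap's own functions.
modifiers = [&String.downcase/1, &String.split/1, &Enum.reverse/1, &Enum.join/1]

# Each modifier receives the result of the previous one.
result = Enum.reduce(modifiers, "A B", fn modifier, value -> modifier.(value) end)

IO.puts(result)
```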
Below we will list all available built-in functions, but here we will also show the more common use cases like splitting, extracting and combining segments of strings, formatting output, and the shortcut syntax that allows us to write commonly used modifiers in an even shorter form than LLLML
### Counting Modifiers and their shortcuts
Let us imagine that we want to prefix each line with ten times its line number, starting at 10. What we can do, of course, is to use LLLML as follows
iex(32)> run(["a", "a", "a", "a"], "%n(+ 1)(* 10) %")
["10 a", "20 a", "30 a", "40 a"]
There are cases where we need the injected line number to be the second argument of a function, e.g. if we want to count down; the placeholder `_` indicates the position where a value shall be injected (see Parameter Positions). Here, though, multiplying by a negative step is enough
iex(33)> run(["b", "b", "b"], "%n(* -10)(+ 110) %")
["110 b", "100 b", "90 b"]
This is obviously a handful, and therefore we provide a shortcut with a simple value notation
iex(34)> run(["b", "b", "b"], "%n:110,-10: %")
["110 b", "100 b", "90 b"]
And the one value version can be used to change the start count, 1 being a popular choice ;)
iex(35)> run(["b", "b", "e"], "%n:1: %")
["1 b", "2 b", "3 e"]
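Judging from the doctests above, the shortcut `%n:start,step:` amounts to `start + n * step` for the zero-based line number `n`, with the step defaulting to 1 in the one-value form. A small sketch of that arithmetic (the `count` function is a hypothetical name):

```elixir
# Hypothetical model of %n:start,step: for a zero-based line number n.
count = fn n, start, step -> start + n * step end

countdown = Enum.map(0..2, fn n -> count.(n, 110, -10) end)
from_one = Enum.map(0..2, fn n -> count.(n, 1, 1) end)

IO.inspect({countdown, from_one})
```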
### Formatting Modifiers and their shortcuts
When inserting strings or numbers into the output we oftentimes need to align them to a length, we also sometimes want to pad them with spaces, zeroes or different characters. This all, of course, can be done with LLLML but as we will show here, shorter patterns are desired and therefore implemented.
#### Aligning strings and numbers
iex(36)> run(~w[alpha beta], "%(rpad 6)")
["alpha ", "beta  "]
And the shorter
iex(37)> run(~w[alpha beta], "%<-6>")
["alpha ", "beta  "]
Or
iex(38)> run(~w[alpha beta], "%(lpad 6 -)")
["-alpha", "--beta"]
And the shorter
iex(39)> run(~w[alpha beta], "%<6->")
["-alpha", "--beta"]
iex(40)> run(["", ""], "%n<2 0>") # Note the space in order to avoid a length of 20
["00", "01"]
#### Formatting Numbers
iex(41)> run(["", ""], "%n:15:(to_s 16)(lpad 2 0)")
["0f", "10"]
And the shorter
iex(42)> run(["", ""], "%n:15:<2x0>")
["0f", "10"]
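For comparison, the same padding and number formatting can be written with Elixir's standard library. This only illustrates what the modifiers compute; it does not claim that CliexMap is implemented this way.

```elixir
# (rpad 6) and (lpad 6 -) expressed with String.pad_trailing/2 and String.pad_leading/3.
right_padded = Enum.map(~w[alpha beta], &String.pad_trailing(&1, 6))
left_padded = Enum.map(~w[alpha beta], &String.pad_leading(&1, 6, "-"))

# (to_s 16)(lpad 2 0): lowercase hex, padded to two characters with zeroes.
hex_padded =
  Enum.map([15, 16], fn n ->
    n |> Integer.to_string(16) |> String.downcase() |> String.pad_leading(2, "0")
  end)

IO.inspect({right_padded, left_padded, hex_padded})
```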
### Parameter Positions
Oftentimes we will use the result of a field as the first argument in a modifier; sometimes, however, we need to inject it at a different position.
We can use the placeholder `_` for this.
We can demonstrate this with an explicit implementation of counting down line numbers (which of course can also be done with the shortcuts shown above)
iex(43)> run(["", "", ""], "%n(- 10 _)")
["10", "9", "8"]
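One way to picture the placeholder rule is as a small function that builds the argument list for a modifier call. The module below is a hypothetical sketch of that rule, using the atom `:_` to represent the placeholder; it is not taken from CliexMap's source.

```elixir
defmodule PlaceholderSketch do
  # Hypothetical helper: if the arguments contain the placeholder,
  # the value replaces it; otherwise the value becomes the first argument.
  def inject(value, args) do
    if :_ in args do
      Enum.map(args, fn
        :_ -> value
        other -> other
      end)
    else
      [value | args]
    end
  end
end

# Models %n(- 10 _) for the line number 2: 10 - 2
IO.inspect(apply(Kernel, :-, PlaceholderSketch.inject(2, [10, :_])))
```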
## Cookbooks
### File Trees
#### Duplicate a directory structure
We have a structure like that
src
|
+ namespace_1
| |
| +---- file1.json, ...., filen.json
|
+ namespace_2
| |
| +---- file1.json, ...., filen.json
|
+ namespace_3
|
+---- file1.json, ...., filen.json
And want to create a structure like that
tests
|
json_tests
|
+ namespace_1
| |
| +---- file1_test.exs, ...., filen_test.exs
|
+ namespace_2
| |
| +---- file1_test.exs, ...., filen_test.exs
|
+ namespace_3
|
+---- file1_test.exs, ...., filen_test.exs
By using the output from, e.g. `git ls-files src` or `find src -name '*.json'`, we can map the input with the following pattern
iex(44)> input = ~W[ src/namespace_1/file1.json src/namespace_2/file1.json src/namespace_2/file2.json ]
...(44)> run(input, "mkdir -p tests/json_tests/%(segment 1); touch tests/json_tests/%(segments 1 2)(sub '.json' '_test.exs')")
[
"mkdir -p tests/json_tests/namespace_1; touch tests/json_tests/namespace_1/file1_test.exs",
"mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file1_test.exs",
"mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file2_test.exs"
]
Changing and using extensions of filenames is frequent enough to justify the `ext` builtin
iex(45)> input = ~W[ src/namespace_1/file1.json ]
...(45)> run(input, "mkdir -p tests/json_tests/%(segment 1); touch tests/json_tests/%(segments 1 2)(ext _test.exs)")
[
"mkdir -p tests/json_tests/namespace_1; touch tests/json_tests/namespace_1/file1_test.exs",
]
## Comprehensive List of Builtin Functions
We use `B` as a shortcut to `CliexMap.Builtins`; we also pass in a `CliexMap.Context` (aliased to `C`), but only if needed
### abs
iex(46)> B.abs(nil, [-10])
10
### add (used for +)
iex(47)> B._add(nil, [1, 2, 3, 4])
10
### at (alias to Enum.at)
iex(48)> B.at(nil, [~w[a b c], 2])
"c"
iex(49)> B.at(nil, [~w[a b c], -3])
"a"
iex(50)> B.at(nil, [~w[a b c], 1, :default])
"b"
iex(51)> B.at(nil, [~w[a b c], 3, :default])
:default
### bn (Path.basename)
iex(52)> B.bn(nil, ~W[a/b])
"b"
iex(53)> B.bn(nil, ~W[b.ext])
"b.ext"
iex(54)> B.bn(nil, ~W[a/c/b.ext])
"b.ext"
### div (used for /)
iex(55)> B._div(nil, [9, 4])
2.25
### dn (Path.dirname)
iex(56)> B.dn(nil, ~W[a/b])
"a"
iex(57)> B.dn(nil, ~W[b.ext])
"."
iex(58)> B.dn(nil, ~W[a/c/b.ext])
"a/c"
### downcase
iex(59)> B.downcase(nil, ["HELLO"])
"hello"
### ext/1
Returns the last extension of a filename
iex(60)> B.ext(nil, ["a.html.eex"])
"eex"
If no extension is found it returns an empty string
iex(61)> B.ext(nil, ["a"])
""
### ext/2
Changes the last extension of a filename. N.B. that the `.` is not added automatically
iex(62)> B.ext(nil, ["a.html.erb", ".eex"])
"a.html.eex"
### idiv (used for :)
iex(63)> B._idiv(nil, [9, 4])
2
### join
iex(64)> B.join(nil, [[1, 2]])
"12"
iex(65)> B.join(nil, [[1, 2, 3], " "])
"1 2 3"
### lpad
iex(66)> B.lpad(nil, ["a", 2])
" a"
iex(67)> B.lpad(nil, ["a", 3, :"-*"])
"-*a"
### mul
iex(68)> B._mul(nil, [1, 2, 3, 4])
24
### reverse
iex(69)> B.reverse(nil, ["alpha"])
"ahpla"
### revlist
iex(70)> B.revlist(nil, [~W[a b]])
~W[b a]
### rpad
iex(71)> B.rpad(nil, ["a", 2])
"a "
iex(72)> B.rpad(nil, ["a", 3, :"--"])
"a--"
### segment
Extract a path segment: with -1 like `basename`
iex(73)> B.segment(nil, ["a/b/c", -1])
"c"
without an arg like `dirname`
iex(74)> B.segment(nil, ["a/b/c"])
"a/b"
or any part
iex(75)> B.segment(nil, ["a/b/c", 0])
"a"
iex(76)> B.segment(nil, ["a/b/c", -2])
"b"
### segments (short for (splicej "/" _1 _2))
iex(77)> B.segments(nil, ["x/y/z", 2])
"z"
iex(78)> B.segments(nil, ["x/y/z", -2, -1])
"y/z"
### slice
#### lists
iex(79)> B.slice(nil, [~w[a b c d], 1])
~w[b c d]
iex(80)> B.slice(nil, [~w[a b c d], 2, 3])
~w[c d]
iex(81)> B.slice(nil, [~w[a b c d], -2])
~w[c d]
iex(82)> B.slice(nil, [~w[a b c d], -4, -2])
~w[a b c]
iex(83)> B.slice(nil, [~w[a b c d], -4, 1])
~w[a b]
iex(84)> B.slice(nil, [~w[a b c d], 1, 0])
[]
#### strings
iex(85)> B.slice(nil, ["abc", 1])
"bc"
iex(86)> B.slice(nil, ["abc", -1])
"c"
iex(87)> B.slice(nil, ["abc", 1, 2])
"bc"
iex(88)> B.slice(nil, ["abcd", -2, -1])
"cd"
iex(89)> B.slice(nil, ["abcd", -2, 3])
"cd"
iex(90)> B.slice(nil, ["abc", 0, -2])
"ab"
#### safe
- First index out of bounds
iex(91)> B.slice(nil, ["abc", 4])
""
iex(92)> B.slice(nil, [~w[a b], 2])
[]
- Last index out of bounds
iex(93)> B.slice(nil, ["abc", 2, 5])
"c"
iex(94)> B.slice(nil, [~w[a b], 1, 4])
~w[b]
### splicej (or splice_join) split, splice and join
join on original separator
iex(95)> B.splicej(nil, ["a/b/c/d/e", "/", 2, 3])
"c/d"
join on original separator, slice to the end
iex(96)> B.splicej(nil, ["a/b/c/d/e", "/", 2])
"c/d/e"
join with different string
iex(97)> B.splice_join(nil, ["a/b/c/d/e", "/", 2, 3, ","])
"c,d"
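The split, splice and join steps can be sketched with the standard library as follows. `SpliceSketch` and its helpers are hypothetical names; the inclusive, zero-based, negative-friendly index handling is inferred from the doctests above rather than taken from the library's source.

```elixir
defmodule SpliceSketch do
  # Split on `sep`, keep the elements from index `first` to `last` (inclusive,
  # negative indices count from the end), then join with `joiner` or `sep`.
  def splicej(string, sep, first, last \\ -1, joiner \\ nil) do
    parts = String.split(string, sep)
    count = length(parts)
    from = normalize(first, count)
    to = normalize(last, count)

    parts
    |> Enum.slice(from, max(to - from + 1, 0))
    |> Enum.join(joiner || sep)
  end

  # Models segments as splicej with "/" as the separator.
  def segments(path, first, last \\ -1), do: splicej(path, "/", first, last)

  defp normalize(index, count) when index < 0, do: max(count + index, 0)
  defp normalize(index, _count), do: index
end

IO.puts(SpliceSketch.splicej("a/b/c/d/e", "/", 2, 3, ","))
```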
### split
iex(98)> B.split(nil, ["a b"])
["a", "b"]
iex(99)> B.split(nil, ["a,b", ","])
["a", "b"]
### sub (from -) (use with care)
iex(100)> B._sub(nil, [1, 2, 3, 4])
-8
### sub (from sub)
iex(101)> B.sub(nil, ["Hello World", "l"])
"Heo Word"
iex(102)> B.sub(nil, ["Hello World", "l", "L"])
"HeLLo WorLd"
### to_s (for numbers only)
iex(103)> B.to_s(nil, [12])
"12"
iex(104)> B.to_s(nil, [12, 16])
"c"
### upcase
iex(105)> B.upcase(nil, ["hello"])
"HELLO"