CliexMap (CliexMap v0.2.2)

cliex_map is a Unix filter that applies a pattern to each input line. It is an opinionated subset of sed, with many built-in features geared toward file manipulation (move, rename, format, timestamps) and toward field and substring extraction à la awk, but much more concise and less powerful.

The design goal is a powerful minilanguage that describes patterns mostly in one line. We shall see where this journey leads ;)

Let us document how these patterns are compiled, and how the compiled form is rendered. Here is a primitive example which demonstrates this:

iex(1)> run(~w[1 2], "Hello")
["Hello", "Hello"]

As a pattern always starts with %, we can render literal %s by doubling them

iex(2)> run(~w[alpha], "%% %%")
["% %"]

N.B. End-of-line characters are added if not present

Patterns

The patterns shown above are literals, which are represented as binaries in the compiled AST

Fields: % and %<integer>

The first pattern we describe here is also the most used. It corresponds to awk's $0, $1 and so forth, but there is a little bit more to it.

  1. Instead of $ we use % (for simpler shell integration, inspired by xargs)

  2. We can write % as a shortcut for %0, and we can write %-1 for awk's $NF, %-2 for $(NF-1) and so on

  3. We can substring, format and modify %f<n> as all other patterns, which will show in the corresponding sections

Here is a simple echo pattern with and without the explicit 0

iex(3)> run(~w[alpha beta], "%")
~w[alpha beta]

N.B. that M is a shortcut for CliexMap.Modifiers defined in the doctest for readability

And here the (double) echo with an explicit 0

iex(4)> run(~w[alpha], "% %0")
["alpha alpha"]

If there is only one field, then %1 is, again, a synonym of %

iex(5)> run(["alpha", "beta gamma"], "% %1")
["alpha alpha", "beta gamma beta"]

Now let us show how negative indices and indices of nonexistent fields are rendered

iex(6)> sentence = "The quick brown fox jumps"
...(6)> run([sentence], "%-1 %-2 '%6'")
["jumps fox ''"]
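
The indexing rules above can be summarised in a small sketch. It is not CliexMap's implementation: FieldSketch and field/2 are hypothetical names, and we assume that fields are separated by whitespace, as in awk's default.

```elixir
defmodule FieldSketch do
  # Hypothetical helper, not part of CliexMap: mimics how %, %<n> and %-<n>
  # pick fields, assuming fields are separated by whitespace.

  # % and %0 stand for the whole line.
  def field(line, 0), do: line

  # %1, %2, ... count from the first field; a missing field renders as "".
  def field(line, index) when index > 0 do
    line |> String.split() |> Enum.at(index - 1, "")
  end

  # %-1, %-2, ... count from the last field, like awk's $NF, $(NF-1), ...
  def field(line, index) when index < 0 do
    line |> String.split() |> Enum.at(index, "")
  end
end

sentence = "The quick brown fox jumps"
IO.inspect(Enum.map([-1, -2, 6], &FieldSketch.field(sentence, &1)))
# prints ["jumps", "fox", ""]
```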

Line Numbers: %n

This is rendered like awk's NR - 1

iex(7)> run(["", "", "ignored"], "%n")
~w[0 1 2]

Timestamps: %ts, %tms, %tmics, %xs, %xms and %xmics

These fields are rendered with the Unix timestamp in seconds, milliseconds or microseconds (in decimal for the %t variants and in hexadecimal for the %x variants). The same timestamp is used for all input lines, thus mapping each line to a constant value.

iex(8)> run([""], "%ts", C.for_now(1691231907123456))
["1691231907"]

N.B. C is a shortcut for CliexMap.Context, defined in the doctest for readability. The context is generated when the pattern is compiled; as a consequence, %ts and friends are rendered identically for each input line. Here are examples for the other 5 timestamp formats.

iex(9)> run([""], "%tms", C.for_now(1691231907123456))
["1691231907123"]

iex(10)> run(["", "", ""], "%tmics", C.for_now(1691231907123456))
["1691231907123456", "1691231907123456", "1691231907123456"]

iex(11)> run([""], "%xs", C.for_now(1691231907123456))
["64ce26a3"]

iex(12)> run([""], "%xms", C.for_now(1691231907123456))
["189c546ed33"]

iex(13)> run(["", "", ""], "%xmics", C.for_now(1691231907123456))
["6022a9d0e9100", "6022a9d0e9100", "6022a9d0e9100"]
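
To make the relation between the six formats explicit, here is a minimal sketch using only Elixir's standard library. TimestampSketch and render/2 are hypothetical names and this is not how CliexMap computes the values internally; the sketch only reproduces the arithmetic, starting from a timestamp in microseconds like the one passed to C.for_now above.

```elixir
defmodule TimestampSketch do
  # Hypothetical helper, not part of CliexMap: derives the six renderings
  # from one timestamp given in microseconds.
  def render(micros, format) do
    case format do
      :ts -> micros |> div(1_000_000) |> Integer.to_string()
      :tms -> micros |> div(1_000) |> Integer.to_string()
      :tmics -> Integer.to_string(micros)
      :xs -> micros |> div(1_000_000) |> hex()
      :xms -> micros |> div(1_000) |> hex()
      :xmics -> hex(micros)
    end
  end

  # Lowercase hexadecimal, matching the doctests above.
  defp hex(integer), do: integer |> Integer.to_string(16) |> String.downcase()
end

IO.inspect(TimestampSketch.render(1691231907123456, :xs))
# prints "64ce26a3"
```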

Variables

While patterns have been simple up to here, we will see later that they can become quite complicated. Without explaining the details of the following example, we can see that parts of a long pattern have to be repeated to create the desired command for each input line

iex(14)> run(["src/DIR/subdir/file.jsno"],
...(14)> ~s{mkdir -p bup/%xs/%(segments 1 -2)(downcase); cp % bup/%xs/%(segments 1 -2)(downcase)/%(segments -1)(sub ".jsno" ".json")},
...(14)> C.for_now(1691231907123456))
["mkdir -p bup/64ce26a3/dir/subdir; cp src/DIR/subdir/file.jsno bup/64ce26a3/dir/subdir/file.json"]

Actually the pattern above is a little more complicated than needed, but this serves to demonstrate the usage of variables

Setting a variable with %S

%S<varname><pattern> will compile <pattern>, but instead of rendering its result it will store it in the context. Thus the following is useless

iex(15)> run(["useless input"], "%Saa<%(split)(revlist)(join)>")
[""]

We can, however, use the variable after its declaration, and as many times as we like

iex(16)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx-%Lx-")
["-ba-ba-ba-"]

Of course a loaded variable can be modified like every other field

iex(17)> run(["A B"], "%Sx<%(downcase)(split)(revlist)(join)>-%Lx-%Lx(upcase)")
["-ba-BA"]

Variables can also be used to store constant strings

iex(18)> run([""], "%Sworld<Brave new world>%Lworld %Lworld(split)(at 1)")
["Brave new world new"]

Pattern Modifiers

All patterns can be modified by a little, Lisp-like minilanguage directly attached to the pattern. This LLLML is expressed in the form of s-expressions with built-in predefined functions.

The value of the field is injected into the first such s-expression, the result of which is injected into the second s-expression, and so forth. Here is an example to demonstrate the principle

  iex(19)> run(["alpha"], "%(reverse)")
  ["ahpla"]
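
The chaining principle can be sketched with plain Elixir functions. This is only an illustration: PipelineSketch and apply_all/2 are hypothetical names, and the standard library functions below merely stand in for the modifiers (downcase), (split), (revlist) and (join) used in the variable examples above.

```elixir
defmodule PipelineSketch do
  # Hypothetical helper, not part of CliexMap: threads a value through a
  # list of one-argument functions, left to right.
  def apply_all(value, modifiers) do
    Enum.reduce(modifiers, value, fn modifier, acc -> modifier.(acc) end)
  end
end

# Stand-ins for (downcase)(split)(revlist)(join)
modifiers = [&String.downcase/1, &String.split/1, &Enum.reverse/1, &Enum.join/1]
IO.inspect(PipelineSketch.apply_all("A B", modifiers))
# prints "ba"
```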

Below we will list all available built-in functions, but here we will also show the more common use cases, like splitting, extracting and combining segments of strings and formatting output, as well as the shortcut syntax that allows us to write commonly used modifiers in an even shorter form than LLLML

Counting Modifiers and their shortcuts

Let us imagine that we want to precede each line with ten times its line number, starting at 10. What we can do, of course, is to use LLLML as follows

iex(20)> run(["a", "a", "a", "a"], "%n(+ 1)(* 10) %")
["10 a", "20 a", "30 a", "40 a"]

Counting down works along the same lines: we multiply by a negative step and then add the start value. Where the injected line number needs to be the second argument of a function, the placeholder _ indicates the position at which the value shall be injected (see Parameter Positions)

iex(21)> run(["b", "b", "b"], "%n(* -10)(+ 110) %")
["110 b", "100 b", "90 b"]

This is obviously a handful, and therefore we provide a shortcut with a simple value notation

iex(22)> run(["b", "b", "b"], "%n:110,-10: %")
["110 b", "100 b", "90 b"]

And the one value version can be used to change the start count, 1 being a popular choice ;)

iex(23)> run(["b", "b", "e"], "%n:1: %")
["1 b", "2 b", "3 e"]
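
Judging from the examples above, the shortcut %n:start,step: boils down to start + n * step for the zero-based line number n, with the step defaulting to 1 in the one-value form. A minimal sketch of this arithmetic follows; CountSketch and count/3 are hypothetical names, not CliexMap functions.

```elixir
defmodule CountSketch do
  # Hypothetical helper, not part of CliexMap: the arithmetic behind
  # %n:start: and %n:start,step:, for a zero-based line number n.
  def count(n, start \\ 0, step \\ 1), do: start + n * step
end

IO.inspect(Enum.map(0..2, &CountSketch.count(&1, 110, -10)))
# prints [110, 100, 90]
IO.inspect(Enum.map(0..2, &CountSketch.count(&1, 1)))
# prints [1, 2, 3]
```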

Formatting Modifiers and their shortcuts

When inserting strings or numbers into the output, we oftentimes need to align them to a given length; sometimes we also want to pad them with spaces, zeroes or other characters. All of this can, of course, be done with LLLML, but as we will show here, shorter patterns are desirable and therefore implemented.

Aligning strings and numbers

iex(24)> run(~w[alpha beta], "%(rpad 6)")
["alpha ", "beta  "]

And the shorter

iex(25)> run(~w[alpha beta], "%<-6>")
["alpha ", "beta  "]

Or

iex(26)> run(~w[alpha beta], "%(lpad 6 -)")
["-alpha", "--beta"]

And the shorter

iex(27)> run(~w[alpha beta], "%<6->")
["-alpha", "--beta"]

A digit can serve as the padding character, too


iex(28)> run(["", ""], "%n<2 0>")  # Note the space in order to avoid a length of 20
["00", "01"]

Formatting Numbers

iex(29)> run(["", ""], "%n:15:(to_s 16)(lpad 2 0)")
["0f", "10"]

And the shorter

iex(30)> run(["", ""], "%n:15:<2x0>")
["0f", "10"]
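
For readers more familiar with Elixir's standard library than with LLLML, the chain (to_s 16)(lpad 2 0) roughly corresponds to the following sketch. HexPadSketch and hex2/1 are hypothetical names; CliexMap may well implement its builtins differently.

```elixir
defmodule HexPadSketch do
  # Hypothetical helper, not part of CliexMap: standard-library equivalent
  # of (to_s 16)(lpad 2 0).
  def hex2(number) do
    number
    |> Integer.to_string(16)
    |> String.downcase()
    |> String.pad_leading(2, "0")
  end
end

IO.inspect(Enum.map([15, 16], &HexPadSketch.hex2/1))
# prints ["0f", "10"]
```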

Parameter Positions

Oftentimes we will use the result of a field as the first argument in a modifier. Sometimes, however, we need to inject it at a different position.

We can use the placeholder _ for this.

We can demonstrate this with an explicit implementation of counting down line numbers (which, of course, can also be done with the shortcuts shown above)

iex(31)> run(["", "", ""], "%n(- 10 _)")
["10", "9", "8"]
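
The injection rule can be sketched as follows. This is an illustration under assumptions, not CliexMap's code: PlaceholderSketch and inject/2 are hypothetical names, and the atom :_ stands in for the placeholder of the minilanguage.

```elixir
defmodule PlaceholderSketch do
  # Hypothetical helper, not part of CliexMap. The atom :_ stands in for
  # the placeholder _ of the minilanguage.
  def inject(args, value) do
    if :_ in args do
      # Replace every placeholder by the injected value.
      Enum.map(args, fn
        :_ -> value
        arg -> arg
      end)
    else
      # Without a placeholder the value becomes the first argument.
      [value | args]
    end
  end
end

# (- 10 _) for the line numbers 0, 1, 2
IO.inspect(Enum.map(0..2, fn n -> apply(Kernel, :-, PlaceholderSketch.inject([10, :_], n)) end))
# prints [10, 9, 8]
```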

Cookbooks

File Trees

Duplicate a directory structure

We have a structure like this

  src
    |
    + namespace_1
    |  |
    |  +---- file1.json, ...., filen.json
    |
    + namespace_2
    |  |
    |  +---- file1.json, ...., filen.json
    |
    + namespace_3
       |
       +---- file1.json, ...., filen.json

And we want to create a structure like this

 tests
   |
   json_tests
     |
     + namespace_1
     |  |
     |  +---- file1_test.exs, ...., filen_test.exs
     |
     + namespace_2
     |  |
     |  +---- file1_test.exs, ...., filen_test.exs
     |
     + namespace_3
       |
       +---- file1_test.exs, ...., filen_test.exs

By using the output from, e.g., git ls-files src or find src -name '*.json', we can map the input with the following pattern

  iex(32)> input = ~W[ src/namespace_1/file1.json src/namespace_2/file1.json src/namespace_2/file2.json ]
  ...(32)> run(input, "mkdir -p tests/json_tests/%(segment 1); touch tests/json_tests/%(segments 1 2)(sub '.json' '_test.exs')")
  [
    "mkdir -p tests/json_tests/namespace_1; touch tests/json_tests/namespace_1/file1_test.exs",
    "mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file1_test.exs",
    "mkdir -p tests/json_tests/namespace_2; touch tests/json_tests/namespace_2/file2_test.exs"
  ]

Changing and using extensions of filenames is frequent enough to justify the ext builtin

  iex(33)> input = ~W[ src/namespace_1/file1.json ]
  ...(33)> run(input, "mkdir -p tests/json_tests/%(segment 1); touch tests/json_tests/%(segments 1 2)(ext _test.exs)")
  [
    "mkdir -p tests/json_tests/namespace_1; touch tests/json_tests/namespace_1/file1_test.exs"
  ]
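
The behaviour of ext, as documented in the list of builtin functions below, is close to what Elixir's Path module offers. The following sketch uses hypothetical names (ExtSketch, ext/1, ext/2) and is not taken from CliexMap's source.

```elixir
defmodule ExtSketch do
  # Hypothetical helper, not part of CliexMap, built on Elixir's Path module.

  # ext/1: the last extension without its dot, "" if there is none.
  def ext(filename) do
    filename |> Path.extname() |> String.trim_leading(".")
  end

  # ext/2: replace the last extension; the replacement is appended verbatim,
  # so a leading "." has to be part of it if one is wanted.
  def ext(filename, replacement) do
    Path.rootname(filename) <> replacement
  end
end

IO.inspect(ExtSketch.ext("namespace_1/file1.json", "_test.exs"))
# prints "namespace_1/file1_test.exs"
```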

Comprehensive List of Builtin Functions

We use B as a shortcut to CliexMap.Builtins, we also pass in a CliexMap.Context (aliased to C) only if needed

abs

iex(34)> B.abs(nil, [-10])
10

add (used for +)

iex(35)> B._add(nil, [1, 2, 3, 4])
10

at (alias to Enum.at)

iex(36)> B.at(nil, [~w[a b c], 2])
"c"

iex(37)> B.at(nil, [~w[a b c], -3])
"a"

iex(38)> B.at(nil, [~w[a b c], 1, :default])
"b"

iex(39)> B.at(nil, [~w[a b c], 3, :default])
:default 

div (used for /)

iex(40)> B._div(nil, [9, 4])
2.25

downcase

iex(41)> B.downcase(nil, ["HELLO"])
"hello"

ext/1

Returns last extension of a filename

iex(42)> B.ext(nil, ["a.html.eex"])
"eex"

If no extension is found it returns an empty string

iex(43)> B.ext(nil, ["a"])
""

ext/2

Changes last extension of a filename, N.B. that the . is not added automatically

iex(44)> B.ext(nil, ["a.html.erb", ".eex"])
"a.html.eex"

idiv (used for :)

iex(45)> B._idiv(nil, [9, 4])
2

join

iex(46)> B.join(nil, [[1, 2]])
"12"

iex(47)> B.join(nil, [[1, 2, 3], " "])
"1 2 3"

lpad

iex(48)> B.lpad(nil, ["a", 2])
" a"

iex(49)> B.lpad(nil, ["a", 3, :"-"])
"-a"

mul

iex(50)> B._mul(nil, [1, 2, 3, 4])
24

reverse

iex(51)> B.reverse(nil, ["alpha"])
"ahpla"

revlist

iex(52)> B.revlist(nil, [~W[a b]])
~W[b a]

rpad

iex(53)> B.rpad(nil, ["a", 2])
"a "

iex(54)> B.rpad(nil, ["a", 3, :"--"])
"a--"

segment

Extract a path segment

  • with -1 like basename

    iex(55)> B.segment(nil, ["a/b/c", -1])
    "c"
  • without an arg like dirname

    iex(56)> B.segment(nil, ["a/b/c"])
    "a/b"
  • or any part

    iex(57)> B.segment(nil, ["a/b/c", 0])
    "a"

    iex(58)> B.segment(nil, ["a/b/c", -2])
    "b"

segments (short for (splicej "/" _1 _2))

iex(59)> B.segments(nil, ["x/y/z", 2])
"z"

iex(60)> B.segments(nil, ["x/y/z", -2, -1])
"y/z"

slice

lists

iex(61)> B.slice(nil, [~w[a b c d], 1])
~w[b c d]

iex(62)> B.slice(nil, [~w[a b c d], 2, 3])
~w[c d]

iex(63)> B.slice(nil, [~w[a b c d], -2])
~w[c d]

iex(64)> B.slice(nil, [~w[a b c d], -4, -2])
~w[a b c]

iex(65)> B.slice(nil, [~w[a b c d], -4, 1])
~w[a b]

iex(66)> B.slice(nil, [~w[a b c d], 1, 0])
[]
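
All list examples above agree with an inclusive range passed to Enum.slice/2, with the last index defaulting to the end of the list. Here is a minimal sketch of that reading; SliceSketch is a hypothetical name, the stepped range syntax first..last//1 requires Elixir 1.12 or later, and CliexMap's actual implementation may differ.

```elixir
defmodule SliceSketch do
  # Hypothetical helper, not part of CliexMap: inclusive first/last indices,
  # negative indices counting from the end, last defaulting to the end.
  def slice(list, first, last \\ -1) do
    # The explicit step of 1 keeps ranges like 1..0 empty instead of reversed.
    Enum.slice(list, first..last//1)
  end
end

IO.inspect(SliceSketch.slice(~w[a b c d], -4, -2))
# prints ["a", "b", "c"]
IO.inspect(SliceSketch.slice(~w[a b c d], 1, 0))
# prints []
```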

strings

iex(67)> B.slice(nil, ["abc", 1])
"bc"

iex(68)> B.slice(nil, ["abc", -1])
"c"

iex(69)> B.slice(nil, ["abc", 1, 2])
"bc"

iex(70)> B.slice(nil, ["abcd", -2, -1])
"cd"

iex(71)> B.slice(nil, ["abcd", -2, 3])
"cd"

iex(72)> B.slice(nil, ["abc", 0, -2])
"ab"

splicej (or splice_join) split, splice and join

  • join on original separator

    iex(73)> B.splicej(nil, ["a/b/c/d/e", "/", 2, 3])
    "c/d"
  • join on original separator, slice to the end

    iex(74)> B.splicej(nil, ["a/b/c/d/e", "/", 2])
    "c/d/e"
  • join with different string

    iex(75)> B.splice_join(nil, ["a/b/c/d/e", "/", 2, 3, ","])
    "c,d"

split

iex(76)> B.split(nil, ["a b"])
["a", "b"]

iex(77)> B.split(nil, ["a,b", ","])
["a", "b"]

sub (from -) (use with care)

iex(78)> B._sub(nil, [1, 2, 3, 4])
-8

sub (from sub)

iex(79)> B.sub(nil, ["Hello World", "l"])
"Heo Word"

iex(80)> B.sub(nil, ["Hello World", "l", "L"])
"HeLLo WorLd"

to_s (for numbers only)

iex(81)> B.to_s(nil, [12])
"12"

iex(82)> B.to_s(nil, [12, 16])
"c"

upcase

iex(83)> B.upcase(nil, ["hello"])
"HELLO"

Functions

run(input_stream, pattern, context \\ nil)