MaskTools v2

Abstract

Authors : Kurosu, Manao, mg262
Version : 2.0 alpha 27
Download : http://manao4.free.fr/
Category : Misc Plugins
Requirements : YV12 Colorspace

Introduction
Common parameters
Filters list
Filters description
Reverse polish notation
Changelog

I) Introduction

MaskTools' dll contains a set of filters designed to create, manipulate and use masks. Masks, in video processing, are a way to give a relative importance to each pixel. You can, for example, create a mask that selects only the green parts of the video, and then replace these parts with another video.

To give the most control over the handling of masks, the filters rely on the fact that the luma and chroma planes can be uncorrelated. That means that a single video will always be considered by the filters as 3 independent planes. That applies to masks as well, which means that a mask clip in fact contains 3 masks, one for each plane.

The filters have a set of common parameters, which mainly concern what processing to do on each plane. They all work only in YV12 ( though with Avisynth 2.6, support for all planar formats will be available ).

II) Common parameters

As said previously, all the filters - except the helpers - share a common set of parameters. These parameters are used to tell what processing to do on each plane / channel, and what area of the video to process.

int "offx" (0), int "offy" (0)

offx and offy are the top left coordinates of the box where the actual processing shall occur. Everything outside that box will be garbage.

int "w" (-1), int "h" (-1)

w and h are the width and height of the processed box. -1 means that the box extends to the lower right corner of the video. That also means that default settings are meant to process the whole picture.

int "y" (3), int "u" (1), int "v" (1)

These three values describe the actual processing mode that is to be used on each plane / channel. Here is how the modes are coded :
- x=-255..0 : all the pixels of the plane will be set to -x.
- x=1 : the plane will not be processed. That means the content of the plane after the filter is pure garbage.
- x=2 : the plane of the first input clip will be copied.
- x=3 : the plane will be processed with the processing the filter is designed to do.
- x=4 (when applicable) : the plane of the second input clip will be copied.
- x=5 (when applicable) : the plane of the third input clip will be copied.
As you can see, the default parameters are chosen to process only the luma and to ignore the chroma. That is because most video processing doesn't touch the chroma when handling 4:2:0.

string "chroma" ("")

When defined, the value contained in this string overrides the u & v processing modes. This is a nice addition proposed by mg262 that makes the filters more user friendly. Allowed values for chroma are :
- "process" : set u = v = 3.
- "copy" or "copy first" : set u = v = 2.
- "copy second" : set u = v = 4.
- "copy third" : set u = v = 5.
- "xxx", where xxx is a number : set u = v = -xxx.
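
The mapping above can be summarised in a few lines of Python. This is only a sketch of the documented rules, not the plugin's code : the name resolve_chroma is hypothetical, and reading the number as a plain non-negative integer is an assumption.

```python
def resolve_chroma(chroma, u=1, v=1):
    """Return the (u, v) processing modes after applying the "chroma" string.

    Hypothetical helper mirroring the documented rules; an empty string
    leaves u and v untouched.
    """
    named = {
        "process": 3,
        "copy": 2,
        "copy first": 2,
        "copy second": 4,
        "copy third": 5,
    }
    if chroma == "":
        return u, v
    if chroma in named:
        mode = named[chroma]
    else:
        # "xxx" where xxx is a number: both planes are filled with that value,
        # which is coded as the negative mode -xxx.
        mode = -int(chroma)
    return mode, mode
```

For instance, resolve_chroma("128") yields u = v = -128, which fills both chroma planes with 128 ( neutral grey ).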

III) Filters list

Here is an exhaustive list of the filters contained in this dll :

Mask creation :
- mt_edge : creates edge masks.
- mt_motion : creates motion masks.
Mask operations :
- mt_invert : inverts masks.
- mt_binarize : transforms soft masks into hard masks.
- mt_logic : combines masks using logic operators.
- mt_hysteresis : combines masks by making the first one grow into the second.
Mask merging :
- mt_merge : merges two clips according to a mask.
Morphological operators :
- mt_expand : expands the mask / the video.
- mt_inpand : inpands the mask / the video.
- mt_inflate : inflates the mask / the video.
- mt_deflate : deflates the mask / the video.
Lut operators :
- mt_lut : applies an expression to all the pixels of a mask / video.
- mt_lutxy : applies an expression to all the pixels of two masks / videos.
- mt_lutf : applies an expression to the pixels of a clip, using a value collected over the whole frame of another clip.
- mt_luts : applies an expression taking neighbouring pixels into account.
Support operators :
- mt_makediff : subtracts two clips.
- mt_adddiff : adds back a difference of two clips.
- mt_clamp : clamps a clip between two other clips.
- mt_average : averages two clips.
Convolutions :
- mt_convolution : applies a separable convolution on the picture.
- mt_mappedblur : applies a special 3x3 convolution on the picture.
Helpers :
- mt_square : creates a string describing a square.
- mt_rectangle : creates a string describing a rectangle.
- mt_diamond : creates a string describing a diamond.
- mt_losange : creates a string describing a lozenge.
- mt_circle : creates a string describing a circle.
- mt_ellipse : creates a string describing an ellipse.
- mt_polish : creates a reverse polish expression from an infix one.

IV) Filters description

mt_edge

mt_edge : string mode("sobel"), int thY1(10), int thY2(10), int thC1(10), int thC2(10)

mode chooses the 3x3 convolution kernel used to compute the mask. There are three predefined kernels, "sobel", "roberts" and "laplace", and you can also enter a custom 3x3 kernel. "sobel" uses the kernel "0 -1 0 -1 0 1 0 1 0", "roberts" uses "0 0 0 0 2 -1 0 -1 0" and "laplace" uses "1 1 1 1 -8 1 1 1 1". The normalization factor of the kernel is computed automatically and rounded up to the next power of 2, to allow faster processing. You can specify your own normalization factor by appending it to the list of coefficients ( "1 1 1 1 -8 1 1 1 1 8" for example ).
thX1 is the low threshold and thX2 the high threshold. Under thX1, the pixel is set to zero; over thX2, to 255; in between, it is left untouched.
Three more kernels are also available : "prewitt", "cartoon" and "min/max". "prewitt" is a more robust kernel, while "cartoon" behaves like "roberts", but takes only negative edges into account. Finally, "min/max" computes the local contrast ( local max - local min ).
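
The thX1 / thX2 rule can be pictured with the following Python sketch. It is not the plugin's code : the name soft_threshold is hypothetical, and leaving a value exactly equal to a threshold untouched is an assumption, since the text only says "under" and "over".

```python
def soft_threshold(value, low, high):
    """Apply the thX1 (low) / thX2 (high) rule to one edge response in [0, 255]."""
    if value < low:
        return 0        # too weak: not an edge
    if value > high:
        return 255      # strong enough: full edge
    return value        # in between: keep the soft value
```

With the default thresholds ( 10 and 10 ), the mask is therefore almost binary : only a response of exactly 10 keeps its original value.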

mt_motion

mt_motion : int thY1(10), int thY2(10), int thC1(10), int thC2(10), int thT(10)

thT decides whether the frame is a scene change or not. The mask is made blank if a scene change is detected, else, the mask is computed.
thX1, thX2 work as for mt_edge.

mt_expand, mt_inpand

mt_xxpand : int thY(255), int thC(255), string mode("square")

mt_expand replaces each pixel by the local maximum, mt_inpand by the local minimum.
thX limits the maximum change.
mode selects the local neighbourhood. It can take the values :
- "square" : 3x3 square neighbourhood - isse optimized.
- "horizontal" : 3x1 horizontal neighbourhood.
- "vertical" : 1x3 vertical neighbourhood.
- "both" : a 3-long cross ( "horizontal" + "vertical" ) neighbourhood.
- a custom mode, where you give a list of coordinates. "0 0 -1 0 1 0" is for example equivalent to "horizontal".
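
As an illustration of the custom mode, here is a minimal Python sketch of an expand over a list of relative coordinates. The names parse_offsets and expand are hypothetical, skipping the neighbours that fall outside the picture is an assumption about border handling, and the coordinates are read as "x y" pairs, which is what the "horizontal" equivalence above implies.

```python
def parse_offsets(text):
    """Turn "0 0 -1 0 1 0" into [(0, 0), (-1, 0), (1, 0)] (x first, then y)."""
    nums = [int(t) for t in text.split()]
    return list(zip(nums[0::2], nums[1::2]))

def expand(plane, offsets, th=255):
    """Replace each pixel by the maximum over `offsets`, raised by at most th.

    plane is a list of rows of ints in [0, 255].
    Neighbours outside the plane are skipped (an assumption).
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            best = plane[y][x]
            for dx, dy in offsets:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    best = max(best, plane[ny][nx])
            # th limits how far the pixel may move from its original value.
            row.append(min(best, plane[y][x] + th))
        out.append(row)
    return out
```

An inpand would be the same loop with min instead of max, and the change limited to -th.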

mt_inflate, mt_deflate

mt_xxflate : int thY(255), int thC(255)

It computes a local average by taking into account only the neighbours whose value is higher ( mt_inflate ) or lower ( mt_deflate ) than the pixel.

mt_merge

mt_merge : clip clip1, clip clip2, clip mask, bool "luma"(false)

It's the backbone of the framework. It merges two clips according to the mask. The bigger the mask value, the more the second clip will be taken into account ( the actual formula is y = ((256 - m) * x1 + m * x2 + 128) / 256, where x1 and x2 are the pixels of clip1 and clip2, and m is the mask value ).
luma is a special mode, where only the luma plane of the mask is used to process all three channels.
u and v are defaulted to 2 (that way, the resulting clip contains the chroma of clip1, and looks right).
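
The merge formula is easy to try out on single pixels. The Python sketch below assumes that the division is an integer division; merge_pixel is a hypothetical name.

```python
def merge_pixel(x1, x2, m):
    """Blend one pixel of clip1 (x1) and clip2 (x2) with mask value m in [0, 255]."""
    # The + 128 term rounds to nearest before the integer division by 256.
    return ((256 - m) * x1 + m * x2 + 128) // 256
```

Note that, with this formula, a mask value of 255 gives a weight of 255/256 to the second clip, so the result can still differ slightly from clip2.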

mt_lut

mt_lut : string expr("x"), string yexpr("x"), string uexpr("x"), string vexpr("x")

It applies a function defined by expr to all the pixels. The function is written in reverse polish notation.
If yexpr, uexpr or vexpr isn't defined, expr is used instead.

mt_lutxy

mt_lutxy : clip clip1, clip clip2, string expr("x"), string yexpr("x"), string uexpr("x"), string vexpr("x")

It applies a two-parameter function defined by expr to all the pixels. The function is written in reverse polish notation.
If yexpr, uexpr or vexpr isn't defined, expr is used instead.

mt_lutf

mt_lutf : clip clip1, clip clip2, string mode("avg"), string expr("y"), string yexpr("y"), string uexpr("y"), string vexpr("y")

It computes a value by collecting the values of the pixels of clip1, according to mode. Then it applies the function defined by the expressions to all the pixels of clip2 ( which are mapped to the y variable, while x is the collected value ).
mode can be :
- "avg" or "average" : computes the average of the values.
- "std" or "standard deviation" : computes the standard deviation of the values.
- "min" : computes the min of the values.
- "max" : computes the max of the values.
- "range" : computes "max" - "min".
- "med" or "median" : computes the median of the values.
A possible use is to increase the dynamic range adaptively : mt_lutf(c, c, mode = "range", expr = "y 128 - 256 * range / 128 +")

mt_luts

mt_luts : clip clip1, clip clip2, string mode("avg"), string pixels(""), string expr("x"), string yexpr("x"), string uexpr("x"), string vexpr("x")

It computes the mode operation on the result of the function defined by expr, where x is the pixel from clip1, and y a pixel from the neighbourhood in clip2, defined by pixels.
mode can take the same values as for mt_lutf.
pixels is a coordinates list, relative to the current pixel. It can be created by one of the form helpers.
Let's see some uses :
- mt_luts( c, c, mode = "avg", pixels = mt_square( 1 ), expr = "y" ) does a convolution by a 3x3 kernel filled with ones.
- mt_luts( c, c, mode = "min", pixels = mt_square( 1 ), expr = "y" ) does an inpand.
- mt_luts( c, c, mode = "range", pixels = mt_square( 1 ), expr = "y" ) does a mt_edge( mode = "min/max" ).
- mt_luts( c, c, mode = "std", pixels = mt_square( 1 ), expr = "y" ) gives the local standard deviation of the clip.
- mt_luts( c, c, mode = "max", pixels = mt_square( 1 ), expr = "x y - abs" ) gives the maximum difference between the surrounding pixels and the center.
- mt_luts( c, c, mode = "med", pixels = mt_square( 1 ), expr = "y" ) gives the median of the surrounding pixels.
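
To make the order of operations explicit ( expression first, then mode ), here is a Python sketch of mt_luts for a single pixel. It is not the plugin's code : luts_pixel is a hypothetical name, the expression is passed as a Python function instead of a reverse polish string, neighbours outside the picture are skipped ( an assumption ), and the "std" mode is left out.

```python
import statistics

# Reductions applied to the list of per-neighbour results.
REDUCERS = {
    "avg": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
    "range": lambda values: max(values) - min(values),
    "med": statistics.median,
}

def luts_pixel(clip1, clip2, px, py, offsets, expr, mode="avg"):
    """Evaluate expr(x, y) for every neighbour y, then reduce with mode."""
    x = clip1[py][px]
    height, width = len(clip2), len(clip2[0])
    results = []
    for dx, dy in offsets:
        nx, ny = px + dx, py + dy
        if 0 <= nx < width and 0 <= ny < height:
            results.append(expr(x, clip2[ny][nx]))
    value = REDUCERS[mode](results)
    # Round to nearest and keep the result in [0, 255].
    return max(0, min(255, int(value + 0.5)))

# 3x3 neighbourhood, centre included, as mt_square( 1 ) would describe it.
SQUARE_1 = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
```

With expr = lambda x, y: abs(x - y) and mode = "max", this reproduces the "maximum difference with the center" example above.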

mt_average

mt_average : clip clip1, clip clip2

Equivalent to mt_lutxy("x y + 2 /"), but faster.

mt_makediff

mt_makediff : clip clip1, clip clip2

Equivalent to mt_lutxy("x y - 128 +"), but faster.

mt_adddiff

mt_adddiff : clip clip1, clip clip2

Equivalent to mt_lutxy("x y + 128 -"), but faster.
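
mt_makediff and mt_adddiff are meant to be used as a pair : the first stores a signed difference around 128, the second adds it back. The Python sketch below follows the two lut expressions given above; the function names are hypothetical, and the clipping to [0..255] is the one described for lut results.

```python
def clip8(value):
    """Keep a value inside the 8-bit range."""
    return max(0, min(255, value))

def makediff(x, y):
    """Same arithmetic as the expression "x y - 128 +"."""
    return clip8(x - y + 128)

def adddiff(x, y):
    """Same arithmetic as the expression "x y + 128 -"."""
    return clip8(x + y - 128)
```

For example, makediff(100, 90) gives 138, and adddiff(90, 138) gives back 100. The round trip is only exact while the difference fits in the 8-bit range : makediff(255, 0) saturates at 255.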

mt_clamp

mt_clamp : clip c, clip bright_limit, clip dark_limit, int overshoot(0), int undershoot(0)

Forces the value of the first clip to be between bright_limit + overshoot and dark_limit - undershoot.
Gives unwanted results if bright_limit + overshoot < dark_limit - undershoot.
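
A per-pixel Python sketch of the clamping rule, under the assumption that it is a plain min / max against the two limits ( clamp_pixel is a hypothetical name, and any final clipping to [0..255] is left out ) :

```python
def clamp_pixel(c, bright_limit, dark_limit, overshoot=0, undershoot=0):
    """Force c into [dark_limit - undershoot, bright_limit + overshoot]."""
    upper = bright_limit + overshoot
    lower = dark_limit - undershoot
    return max(lower, min(c, upper))
```

When upper < lower, the result of this sketch is always lower, whatever c is, which shows why such inputs give unwanted results.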

mt_invert

mt_invert : clip c

Inverts the values of the pixels.
Equivalent to mt_lut("255 x -"), but faster.

mt_binarize

mt_binarize : clip c, int threshold(128), bool upper(false)

If upper is true, forces all values strictly over threshold to 0, and all others to 255.
Else, forces all values strictly over threshold to 255, else to 0.
upper = true is equivalent to mt_lut("x threshold > 0 255 ?"), but faster.
upper = false is equivalent to mt_lut("x threshold > 255 0 ?"), but faster.
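
The two lut expressions above translate directly to the following Python sketch ( binarize is a hypothetical name; the behaviour is taken from the expressions, not from the plugin's source ) :

```python
def binarize(x, threshold=128, upper=False):
    """Per-pixel equivalent of the two lut expressions given for mt_binarize."""
    over = x > threshold          # strictly over the threshold
    if upper:
        return 0 if over else 255   # "x threshold > 0 255 ?"
    return 255 if over else 0       # "x threshold > 255 0 ?"
```

A pixel exactly equal to the threshold is never counted as "over" : binarize(128) returns 0 with the default settings.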

mt_logic

mt_logic : clip clip1, clip clip2, string mode("and")

Applies the function defined by mode to clip1 and clip2.
Possible values for mode are :
- "and" : does a binary "and" on each pair of pixels ( 11 & 5 is computed by converting both values to binary and and-ing all the bits : 11 = 1011, 5 = 101, 11 & 5 = 1 ).
- "or" : does a binary "or" on each pair of pixels ( 11 | 5 = 1011 | 101 = 1111 = 15 ).
- "xor" : does a binary "xor" on each pair of pixels ( 11 ^ 5 = 1011 ^ 101 = 1110 = 14 ).
- "andn" : does a binary "and not" on each pair of pixels ( 11 & ~5 = 1011 & ~101 = 1011 & 11111010 = 1010 = 10 ).
- "min" : gives the minimum of each pair of pixels.
- "max" : gives the maximum of each pair of pixels.
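
The worked examples above can be checked with Python's own bitwise operators. The function logic_pixel is a hypothetical name; masking with 0xFF keeps the "and not" result on 8 bits.

```python
LOGIC_MODES = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "andn": lambda a, b: a & ~b & 0xFF,  # ~b is taken on 8 bits
    "min": min,
    "max": max,
}

def logic_pixel(a, b, mode="and"):
    """Combine two 8-bit pixel values with the given mode."""
    return LOGIC_MODES[mode](a, b)
```

On binary masks ( 0 / 255 ), "and" and "min" give the same result, as do "or" and "max"; they only differ on soft masks.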

mt_hysteresis

mt_hysteresis : clip small_mask, clip big_mask

Grows the small mask into the big mask by connected components. This makes it possible to build more robust edge masks.

mt_convolution

mt_convolution : clip c, string horizontal("1 1 1"), string vertical("1 1 1"), bool saturate(true), float total(1.0f)

Applies the convolution defined by the kernel horizontal^T x vertical ( the product of the transposed horizontal vector and the vertical vector ) to the video.
Both horizontal and vertical must have an odd length.
The default normalization value is the sum of the absolute values of the coefficients of the kernel.
If saturate is true, the result of the convolution is clipped to [0..255], else the absolute value of the result is clipped to [0..255].
If total is defined, it overrides the default normalization value.
Computations occur in float as soon as one element of horizontal or vertical is a float.
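
The Python sketch below shows how the two strings can be combined into a 2D kernel, how the default normalization value is obtained, and what saturate changes. All names are hypothetical, and the layout of the 2D kernel ( vertical coefficients along the rows, horizontal ones along the columns ) is an assumption.

```python
def parse_kernel(text):
    """Turn "1 2 1" into [1.0, 2.0, 1.0]."""
    return [float(token) for token in text.split()]

def kernel_2d(horizontal, vertical):
    """Outer product of the two 1D kernels (separable convolution)."""
    h = parse_kernel(horizontal)
    v = parse_kernel(vertical)
    return [[vc * hc for hc in h] for vc in v]

def default_total(kernel):
    """Default normalization: sum of the absolute values of the coefficients."""
    return sum(abs(c) for row in kernel for c in row)

def finish(value, saturate=True):
    """Turn a normalized convolution result into an 8-bit pixel."""
    if not saturate:
        value = abs(value)   # keep the magnitude of negative responses
    return max(0, min(255, int(value + 0.5)))
```

With horizontal = "1 2 1" and vertical = "1 0 -1", the 2D kernel has a default normalization value of 8. A response of -40.2 becomes 0 with saturate = true and 40 with saturate = false.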

mt_mappedblur

mt_mappedblur : clip c, clip map, string kernel("1 1 1 1 1 1 1 1 1"), string mode("replace")

Applies the convolution kernel to the clip, but in a special way, according to mode :
- "replace" : if a pixel differs by more than map from the center pixel of the convolution, it is replaced by the center value.
- "dump" : if a pixel differs by more than map from the center pixel of the convolution, it is not taken into account.

mt_square, mt_circle, mt_diamond

mt_square : int radius(1), bool zero(true)
mt_circle : int radius(1), bool zero(true)
mt_diamond : int radius(1), bool zero(true)

Creates a relative coordinates list that can be used in mt_luts, mt_expand and mt_inpand.
zero decides whether the center of the form is included or not.

mt_rectangle, mt_ellipse, mt_losange

mt_rectangle : int hor_radius(1), int ver_radius(1), bool zero(true)
mt_ellipse : int hor_radius(1), int ver_radius(1), bool zero(true)
mt_losange : int hor_radius(1), int ver_radius(1), bool zero(true)

Creates a relative coordinates list that can be used in mt_luts, mt_expand and mt_inpand.
zero decides whether the center of the form is included or not.
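
To give an idea of what such a list looks like, here is a Python sketch for the rectangle and square cases. It is not the plugin's code : the function names are hypothetical, and the exact order of the coordinates in the real strings is an assumption ( only the "x y" pair format and the leading "0 0" are taken from this documentation ).

```python
def rectangle(hor_radius=1, ver_radius=1, zero=True):
    """Build a string of relative "x y" pairs covering a rectangle."""
    # The centre comes first when zero is true, and is left out otherwise.
    points = [(0, 0)] if zero else []
    for dy in range(-ver_radius, ver_radius + 1):
        for dx in range(-hor_radius, hor_radius + 1):
            if (dx, dy) != (0, 0):
                points.append((dx, dy))
    return " ".join(f"{dx} {dy}" for dx, dy in points)

def square(radius=1, zero=True):
    """A square is a rectangle with equal radii."""
    return rectangle(radius, radius, zero)
```

In this sketch, rectangle(1, 0) returns "0 0 -1 0 1 0", the string given as equivalent to the "horizontal" mode of mt_expand / mt_inpand.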

mt_polish

mt_polish : string expr("x")

Creates a reverse polish expression from an infix one.
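
The conversion can be pictured with the classic shunting-yard algorithm. The Python sketch below is not mt_polish itself : to_rpn is a hypothetical name, only the binary operators and parentheses are handled ( no functions, no ternary operator ), and the precedence and associativity rules are the usual mathematical ones, assumed rather than taken from the plugin.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}

def to_rpn(infix):
    """Convert an infix expression to reverse polish notation."""
    tokens = re.findall(r"\d+\.?\d*|[A-Za-z]+|[-+*/%^()]", infix)
    output, stack = [], []
    for token in tokens:
        if token in PRECEDENCE:
            # Flush operators that must be applied before this one.
            while stack and stack[-1] in PRECEDENCE and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[token]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[token]
                    and token not in RIGHT_ASSOCIATIVE)
            ):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            output.append(token)  # number or variable
    while stack:
        output.append(stack.pop())
    return " ".join(output)
```

to_rpn("(3 + 5) * x") returns "3 5 + x *", and to_rpn("x - y + 128") returns "x y - 128 +", the expression given for mt_makediff.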

V) Reverse polish notation

A lot of filters accept custom functions defined by an expression written in reverse polish notation. You may not be accustomed to this notation, so here are a few pointers :

The basic concept behind the notation is to write the operator / function after the arguments. Hence, "x + y" in infix notation becomes in reverse polish "x y +". "(3 + 5) * x" would become "3 5 + x *".
As you noticed in the last example, the great asset of the notation is that it doesn't need parentheses. The expression that would have been enclosed in parentheses ( "3 + 5" ) is correctly computed, because we read the expression from left to right, and because when the "+" is encountered, its two operands are unambiguously known.
The supported operators are : "+", "-", "*", "/", "%" ( modulo ) and "^" ( power )
The supported functions are : "sin", "cos", "tan", "asin", "acos", "atan", "exp", "log", "abs"
Making the assumption that a positive float is "true", and a negative one is "false", we can also define boolean operators : "&", "|", "&!" ( and not ), "°" ( xor ).
We can create boolean values with the following comparison operators : "<", ">", "<=", ">=", "!=", "==".
The variables "x" and "y" ( when applicable ) contain the value of the pixel. It's an integer that ranges from 0 to 255.
The constant "pi" can be used.
Finally, there's a ternary operator : "?", which acts like a "if .. then .. else .."
All the computations are made on floats, and the final result is rounded to the nearest integer, in the range [0..255].
Throughout the whole documentation, you'll be able to find plenty of examples.
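
To experiment with expressions outside Avisynth, a small stack-based evaluator is enough. The Python sketch below is not the plugin's parser : eval_rpn is a hypothetical name, booleans are represented as 1.0 / -1.0 ( an assumption consistent with the positive / negative rule above ), "==" uses the tolerance mentioned in the changelog, and the xor operator is left out.

```python
import math

def eval_rpn(expr, x=0, y=0):
    """Evaluate a reverse polish expression for one pixel pair (x, y)."""
    def truth(flag):
        return 1.0 if flag else -1.0

    binary = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
        "%": math.fmod,
        "^": math.pow,
        "<": lambda a, b: truth(a < b),
        ">": lambda a, b: truth(a > b),
        "<=": lambda a, b: truth(a <= b),
        ">=": lambda a, b: truth(a >= b),
        "==": lambda a, b: truth(abs(a - b) < 0.000001),
        "!=": lambda a, b: truth(abs(a - b) >= 0.000001),
        "&": lambda a, b: truth(a > 0 and b > 0),
        "|": lambda a, b: truth(a > 0 or b > 0),
        "&!": lambda a, b: truth(a > 0 and not b > 0),
    }
    unary = {
        "sin": math.sin, "cos": math.cos, "tan": math.tan,
        "asin": math.asin, "acos": math.acos, "atan": math.atan,
        "exp": math.exp, "log": math.log, "abs": abs,
    }
    names = {"x": float(x), "y": float(y), "pi": math.pi}

    stack = []
    for token in expr.split():
        if token in binary:
            b = stack.pop()
            a = stack.pop()
            stack.append(binary[token](a, b))
        elif token in unary:
            stack.append(unary[token](stack.pop()))
        elif token == "?":
            # "cond a b ?" gives a if cond is true, else b.
            otherwise = stack.pop()
            then = stack.pop()
            cond = stack.pop()
            stack.append(then if cond > 0 else otherwise)
        elif token in names:
            stack.append(names[token])
        else:
            stack.append(float(token))
    # Round to nearest, then clip to the 8-bit range.
    return max(0, min(255, int(math.floor(stack.pop() + 0.5))))
```

For example, eval_rpn("3 5 + x *", x = 2) returns 16, and eval_rpn("x 128 > 0 255 ?", x = 200) returns 0. A real lut would evaluate the expression once for each of the 256 possible values of x ( or for each x, y pair ) and store the results in a table.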

VI) Changelog

Alpha 27 :

fixed : mt_binarize asm optimizations that borked with some thresholds

Alpha 26 :

fixed : avs closing issue.

Alpha 25 :

added : new html documentation.
fixed : wrong frame issue.
fixed : mt_merge with luma=true.

Alpha 24 :

fixed : issues with MT.dll ( thanks tsp, Boulder, vanessam and all those who suffered the bug ).
fixed : check for YV12 colorspace, and report an error if it isn't ( thanks Boulder ).
speed up : median mode for luts ( once again, thanks to tsp ).

Alpha 23 :

fix & speed up : median mode, thanks to tsp's insightful remark. Note to self : think less like a mathematician, and more like a programmer. Simpler, faster & not bugged.

Alpha 22 :

added : "med"/"median" mode to luts/lutf.
changed : luts doesn't necessarily consider the center pixel.
changed back : forms helpers prepends (0, 0).
changed : forms helpers now have a bool "zero" parameter, defaulted to true.
added : bool "luma" parameter to mt_merge, which makes it use the luma mask for all three planes, and which forces chroma modes to "process" ( u=v=3 ).

Alpha 21 :

fixed : two & three input clips filters were requesting wrong frames, leading to ghost artefacts.

Alpha 20 :

fixed : huge bug preventing most filters from working.

Alpha 19 :

code refactoring.
fixed : bug with asm and width lower than 64.
fixed : doesn't prepend (0, 0) pixel to the forms helpers.
added : "min/max" mode to mt_edge. The edge value is local max - local min ( taken on a 3x3 square ).
added : mt_lutf : a frame lut, see the description above.
added : mt_luts : a spatial lut, see the description above.

Alpha 18 :

added : mt_makediff, mt_adddiff, mt_average and mt_clamp, ported from mg262's limitedsupport plugin. The asm code is his, though it has been ported to nasm. They respectively amount to MakeDiff, AddDiff, SimpleAverage and Clamp.
added : mt_edge : "prewitt" kernel, taken from mg262's Prewitt filter. Unlike mg262's filter, there's no multiplier ( it's always 1 ), but mt_edge's thresholds still apply. Results, and speed, are identical except for the borders, which are now filtered.
added : "chroma" parameter, taken from mg262's excellent idea. It's a string that, if used, overrides U and V values. It can be either "process", "copy", "copy first", "copy second" or a number. "copy" and "copy first" work alike.
added : vmToon-0.74, adapted to masktools 2.0.
added : LimitedSharpenFaster, with LimitedSupport functions imported into the masktools.

Alpha 17 :

changed : behavior of mt_edge with a custom kernel : the automatic normalization factor is now the sum of the absolute value of the coefficients, ceiled to the next power of two if that power is <= 128 ( else, it isn't ceiled ).
added : cartoon mode for mt_edge.
added : modified mfToon script, for masktools v2. mfToonLite's speed goes from 30 fps to 70 fps, mfToon from 4.5 to 6.5.

Alpha 16 :

fixed : some asm code used in invert, binarize and memset to a particular value. The bug made the first 8 pixels of the picture incorrect. Also avoids another nasty issue that arises when cropping ( not my fault this time, though ).

Alpha 15 :

fixed : bugs from inflate & deflate ( thx you know you ).
reversed : inflate and deflate now match their masktools' v1 counterparts' behavior. ( if anybody used the new buggy one, let him speak quickly ).

Alpha 14 :

fixed : random crashes with some width and asm functions ( thx Didee ).

Alpha 13 :

fixed : mt_merge order swapped for mask operation ( no comment... ).

Alpha 12 :

fixed : bug with some width ( mod4 ) for the non processing mode ( != 1 or 3 ).
changed : mt_merge order swapped for mask operation.

Alpha 11 :

fixed : mt_convolution's multiple instantiation bug.

Alpha 10 :

fixed : offY was always set to offX.
fixed : offset quirk.
fixed : mt_convolution was crashing with floats.
changed : luts' equal operator is now equivalent to abs(x-y) < 0.000001.
added : bool saturate(true) parameter to mt_convolution.
added : float total(1.0) parameter to mt_convolution.

Alpha 9 :

fixed : mt_lut, mt_lutxy : even faster loading.
fixed : mt_convolution : negative coefficients were offset by 1.
fixed : mt_convolution : division by zero if the sum of the coefficients was 0.

Alpha 8 :

fixed : mt_edge in custom mode wasn't working properly.
fixed : mt_edge in custom mode, optimized wasn't working properly either.
fixed : mt_lutxy was slow to load, it's better now.

Alpha 7 :

fixed : forgot to add functions to the parser. Thanks Didee for pointing that out.

Alpha 6 :

fixed : mt_polish was having some trouble with functions.

Alpha 5 :

added : helpers for creating string for inpand / expand custom modes :
- mt_circle
- mt_square
- mt_diamond
- mt_ellipse
- mt_rectangle
- mt_losange
added : helper for lut : conversion from infix to reverse polish notation :
- mt_polish

Alpha 4 :

added : custom modes for inpand / expand.

Alpha 3 :

fixed : mt_invert, mt_binarize, mt_lutxy, which weren't working properly anymore.
fixed : offset created by incorrect rounding in mt_convolution.
fixed : mmx version of edges filters ( soft thresholding, and roberts ).
fixed : mmx version of motion edge ( soft thresholding ).
added : mt_mappedblur.

Alpha 2 :

added functions to luts : sin, abs, cos, tan, exp, log, acos, atan, asin.
added "vertical", "horizontal" and "both" mode to mt_inpand / mt_expand.
added mt_convolution.
fixed mt_merge behavior for y, u, v = 2.
added y, u, v = 4, for masked merge : copies the second clip's channel. It applies to any filter taking two input clips.
internal changes ( code reorganization ).

Alpha 1 :

Original release.