Option to exclude NULL (empty rows) from column analysis Information Analyzer
Real life use case:
Information Analyzer is used to profile data using Column Analysis with Data Sample of given size ( for example 10k or 100k rows).
Analyzed column have 20 milions rows in total: lets assume 18 milion rows are empty (NULL), and 2 milion do have meaningfull values.
Currently empty values (NULLs) are collected into Data Sample, as well as real values. There is no way to exclude NULLs from being collected into data sample now.
So selecting 10k or 100k rows from table above, without excluding NULL values, leads to situation that Data Sample MAY CONTAINS ONLY NULLs...
As this analysis are done on reall applications (2000 tables+), then any manual workaround is not applicable here (like creating virtual table for each analyzed column). Obviously, using very big Data Sample have negative performance impact.
Using SQL terminology, in Data Sample there should be an option to add "...WHERE IS NOT NULL; " to the statement collecting data from each of analysed columns.