Currently the HASH function is implemented as a "User Defined Function" within DataStage. It was developed as-is, so it does not work when a job is optimized with Balanced Optimizer (BalOpt).
The DataStage development team took the hash code provided by the Netezza team and compiled it into a shared library. That shared library was registered as a User Defined Function within DataStage and put into use. Some changes were made to the code provided by Netezza to adhere to PX requirements.
As a User Defined Function, it gives users comparable hash functionality within DataStage job design, but it cannot be used with Balanced Optimizer or for any other purpose.
Who would benefit from this IDEA?
Almost all of our processes use the HASH function to build efficient distribution keys. We have more than 3000 processes, so we cannot change them manually.
How should it work?
The DataStage HASH function should be included in the standard functions, and it must produce the same result as the Netezza HASH8 function.
Also, BalOpt should convert the DataStage HASH function into the Netezza HASH8 function, or into the equivalent hash function of other databases.
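To illustrate the kind of conversion we have in mind, here is a minimal sketch in Python. It is not how Balanced Optimizer works internally (BalOpt operates on job designs, not on strings); it only shows the requested mapping from a DataStage HASH call to a target-specific function. The names `TARGET_HASH_FUNCTIONS` and `rewrite_hash_calls` are hypothetical, and only the Netezza entry comes from this request.

```python
import re

# Hypothetical mapping from target database to its hash function name.
# Only the Netezza -> HASH8 entry is taken from this request; entries for
# other databases would have to be added by whoever implements the feature.
TARGET_HASH_FUNCTIONS = {"netezza": "HASH8"}

# Matches the name HASH only when it is used as a call, e.g. HASH(customer_id).
# The word boundary keeps names such as MYHASH( from matching, and HASH8( is
# skipped because the name is not directly followed by a parenthesis.
_HASH_CALL = re.compile(r"\bHASH\s*\(", re.IGNORECASE)


def rewrite_hash_calls(expression: str, target: str) -> str:
    """Replace HASH(...) calls with the hash function of the target database."""
    function_name = TARGET_HASH_FUNCTIONS.get(target.lower())
    if function_name is None:
        raise ValueError(f"no hash function mapping for target {target!r}")
    return _HASH_CALL.sub(f"{function_name}(", expression)


if __name__ == "__main__":
    print(rewrite_hash_calls("HASH(customer_id)", "netezza"))
```

With a mapping like this, the same job expression could stay unchanged in non-optimized jobs and be translated only when the job is pushed down to the database.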
Currently we work around this by overriding the queries in the Netezza target connectors. We would like to run optimized or non-optimized jobs depending on the infrastructure, without having to change any of them manually.