We need the data on both accelerators to stay in sync, so we use a single statement to load both accelerators.
Real case: Every 15 minutes, data is loaded from the OPERLOG logstream into a shadow table on two IDAA accelerators. To avoid loading duplicate data, we load only the rows that have been written to the logstream since the last load ran. To do this, we select the max timestamp from the table in IDAA and then load into IDAA only the logstream rows whose timestamps are later than that max timestamp. This works well unless the load to one accelerator fails while the load to the other succeeds. The max timestamps then differ between the two accelerators, and the next load can produce either a gap or duplicates, because there is no guarantee which accelerator the job will retrieve the max timestamp from.
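The failure mode described above can be reproduced with a small simulation. This is only a sketch in plain Python, not IDAA or Db2 code: timestamps are modeled as integers, each accelerator's shadow table is a list, and the names `max_timestamp`, `incremental_load`, and `next_cycle` are hypothetical.

```python
from typing import List, Tuple


def max_timestamp(table: List[int]) -> int:
    """High-water mark: the newest timestamp already in the shadow table."""
    return max(table)


def incremental_load(table: List[int], logstream: List[int], mark: int) -> None:
    """Append only the logstream rows that are newer than the mark."""
    table.extend(ts for ts in logstream if ts > mark)


def next_cycle(read_mark_from: str) -> Tuple[List[int], List[int]]:
    """Run the cycle that follows a partial failure.

    Assumed starting state: rows 4 and 5 reached accelerator A,
    but the load to accelerator B failed.
    """
    acc = {"A": [1, 2, 3, 4, 5], "B": [1, 2, 3]}
    logstream = [1, 2, 3, 4, 5, 6]  # row 6 arrived since the last cycle

    # The job reads one max timestamp and uses it for both loads.
    mark = max_timestamp(acc[read_mark_from])
    for table in acc.values():
        incremental_load(table, logstream, mark)
    return acc["A"], acc["B"]


print(next_cycle("A"))  # mark taken from A: B never receives rows 4 and 5
print(next_cycle("B"))  # mark taken from B: A receives rows 4 and 5 twice
```

If the max timestamp happens to come from the accelerator that loaded successfully, the other one ends up with a gap; if it comes from the accelerator that failed, the successful one ends up with duplicates.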
|Who would benefit from this IDEA?||Having a reliable max timestamp to trust will prevent duplicate rows or gaps (missing data) when loading to IDAA|
How should it work?
Provide an option so that if the load to one accelerator fails, the overall job fails. The changes made by the successful load should be rolled back to keep the two accelerators in sync. The option must be flexible enough to be turned off when necessary: if an accelerator is down for an extended period of time, the customer may want to continue loading the remaining accelerator.
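The requested behavior could look roughly like the following. This is a minimal sketch in plain Python that only simulates the semantics; it is not an existing IDAA interface. The function `load_accelerators`, the `require_all` flag, and the `unavailable` parameter (used here to simulate a failing accelerator) are all hypothetical names.

```python
from typing import Dict, List, Set


class LoadFailed(Exception):
    """Raised when the job fails because at least one accelerator load failed."""


def load_accelerators(
    accelerators: Dict[str, List[int]],
    new_rows: List[int],
    unavailable: Set[str],
    require_all: bool = True,
) -> List[str]:
    """Load new_rows into every accelerator and return the names that failed.

    With require_all=True, any failure undoes the successful loads and
    raises LoadFailed. With require_all=False, the healthy accelerators
    keep their new rows and the job continues.
    """
    # Remember the state before the load so it can be restored.
    snapshots = {name: list(rows) for name, rows in accelerators.items()}
    failed = []
    for name, rows in accelerators.items():
        if name in unavailable:  # simulated load failure
            failed.append(name)
        else:
            rows.extend(new_rows)

    if failed and require_all:
        for name, rows in accelerators.items():
            rows[:] = snapshots[name]  # roll back the successful loads
        raise LoadFailed(f"load failed on {failed}; all loads rolled back")
    return failed


# Option on (default): one failure fails the job and nothing is kept.
strict = {"A": [1, 2, 3], "B": [1, 2, 3]}
try:
    load_accelerators(strict, [4, 5], unavailable={"B"})
except LoadFailed as err:
    print(err)

# Option off: accelerator A is still loaded while B is down.
relaxed = {"A": [1, 2, 3], "B": [1, 2, 3]}
failed = load_accelerators(relaxed, [4, 5], unavailable={"B"}, require_all=False)
```

With the option on, both accelerators keep the same max timestamp after a failed cycle, so the next run can read it from either one. With the option off, the accelerators diverge by design, and the customer accepts that while one accelerator is unavailable.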
|Customer Name||Mark Kaiser|