Given that a model can be affected by the data used to train it, it would be useful to have finer control over how the data is split.
A fairly easy improvement is to allow the user to set a seed for controlling the splits, so you can replicate the split in the future.
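To illustrate what a seeded split could look like, here is a minimal sketch in plain Python. It is not the product's implementation; the function name `split_train_test` and its parameters `test_fraction` and `seed` are hypothetical, chosen only for this example.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=None):
    """Shuffle a copy of `rows` and split it into (train, test).

    Hypothetical helper: passing the same `seed` with the same input
    reproduces the same split; seed=None gives a different split each run.
    """
    rng = random.Random(seed)      # local generator, global random state is untouched
    shuffled = list(rows)          # copy so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train_a, test_a = split_train_test(range(100), test_fraction=0.2, seed=42)
    train_b, test_b = split_train_test(range(100), test_fraction=0.2, seed=42)
    print(test_a == test_b)        # same seed, same split
```

Recording the seed alongside the model results would be enough to regenerate the same training and test sets later.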
A more complex change is to allow the user to specify how many different variations of the train/test split to generate and run models against, then either combine the results into an ensemble or select the best of the variations as the "final" result.
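The multi-split idea could be sketched roughly as follows. Everything here is an assumption made for illustration: the toy model (a least-squares slope through the origin), the mean-squared-error score, and the names `seeded_split`, `run_variations`, `select_best`, and `ensemble_predict` are hypothetical and do not describe how the product trains or scores models.

```python
import random
from statistics import mean


def seeded_split(rows, test_fraction, seed):
    """Reproducible shuffle-and-split of `rows` into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_slope(train):
    """Toy model: least-squares slope through the origin for (x, y) pairs."""
    return sum(x * y for x, y in train) / sum(x * x for x, _ in train)


def mse(slope, test):
    """Mean squared error of the toy model on held-out (x, y) pairs."""
    return mean((y - slope * x) ** 2 for x, y in test)


def run_variations(rows, n_splits, base_seed=0, test_fraction=0.2):
    """Train one model per split variation; seeds are base_seed, base_seed + 1, ..."""
    results = []
    for i in range(n_splits):
        seed = base_seed + i
        train, test = seeded_split(rows, test_fraction, seed)
        slope = fit_slope(train)
        results.append({"seed": seed, "slope": slope, "mse": mse(slope, test)})
    return results


def select_best(results):
    """Option 1: keep the variation with the lowest held-out error."""
    return min(results, key=lambda r: r["mse"])


def ensemble_predict(results, x):
    """Option 2: average the predictions of all variations."""
    return mean(r["slope"] * x for r in results)


if __name__ == "__main__":
    # Synthetic data: y is roughly 2 * x with a small deterministic wobble.
    rows = [(x, 2 * x + ((x * 7) % 5 - 2) * 0.1) for x in range(1, 51)]
    results = run_variations(rows, n_splits=5, base_seed=7)
    print(select_best(results)["seed"])
    print(ensemble_predict(results, 10))
```

One design point to keep in mind: each variation is scored on a different test set, so the "best" score is somewhat optimistic. Keeping the seed with each result makes any chosen variation reproducible.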