IBM Data and AI


Setting Seeds

I want to test models while always drawing the same training data with the help of a random variable (e.g. for the NAIVEBAYES command). It is important that the random numbers are exactly the same on every run. In SPSS I can do this with SET, but there are pitfalls the user might not be aware of:

 

* Method 1a (variables produce different values).
SET RNG=MT MTINDEX=2019.
COMPUTE Test1a = RV.NORMAL(100,10).
COMPUTE Test2a = RV.NORMAL(100,10).
COMPUTE Test3a = RV.NORMAL(100,10).
EXECUTE.

* Method 1b (works like Method 1a; the 2nd and 3rd SET commands are ignored).
SET RNG=MT MTINDEX=2019.
COMPUTE Test1b = RV.NORMAL(100,10).
SET RNG=MT MTINDEX=2019.
COMPUTE Test2b = RV.NORMAL(100,10).
SET RNG=MT MTINDEX=2019.
COMPUTE Test3b = RV.NORMAL(100,10).
EXECUTE.

* Method 2 (variables produce exactly the same values).
SET RNG=MT MTINDEX=2019.
COMPUTE Test4 = RV.NORMAL(100,10).
EXECUTE.
SET RNG=MT MTINDEX=2019.
COMPUTE Test5 = RV.NORMAL(100,10).
EXECUTE.
SET RNG=MT MTINDEX=2019.
COMPUTE Test6 = RV.NORMAL(100,10).
EXECUTE.

* Method 3 (a single SET, then one EXECUTE per COMPUTE; variables are transposes of Test1b to Test3b).
SET RNG=MT MTINDEX=2019.
COMPUTE TestVar7 = RV.NORMAL(100,10).
EXECUTE.
COMPUTE TestVar8 = RV.NORMAL(100,10).
EXECUTE.
COMPUTE TestVar9 = RV.NORMAL(100,10).
EXECUTE.

 

It seems like this is one of those rare examples where an immediate EXECUTE is needed.
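
One way to check this, as a rough sketch only: if Method 2 really reproduces the same numbers, the difference variables below should be 0 in every case. Diff45 and Diff46 are names made up for this check and are not part of the original syntax. DESCRIPTIVES is a procedure, so it triggers the data pass itself, and no random number function is involved in the check.

```spss
* Check sketch (Diff45 and Diff46 are hypothetical helper variables).
COMPUTE Diff45 = Test4 - Test5.
COMPUTE Diff46 = Test4 - Test6.
DESCRIPTIVES VARIABLES=Diff45 Diff46 /STATISTICS=MIN MAX.
```

A minimum and maximum of 0 for both variables would confirm identical values. Running the same kind of check on Test1a to Test3a should show nonzero differences.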

  • Guest
  • Dec 18 2019
  • Future consideration
Who would benefit from this IDEA?

It's no big deal, just an overdue clarification in the reference document.
How should it work?

Method 2 does what is needed. However, the syntax reference guide says nothing about these variants of setting a seed or the effects they have. It would be worthwhile to explain them in more detail. Kirill Orlov did good work on this in the SPSS newsgroup: https://listserv.uga.edu/cgi-bin/wa?A2=ind1912&L=SPSSX-L&X=E7796C2500D9A6B6E6&Y=spss.giesel%40yahoo.de&P=28072
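
Applied to the training data use case, the Method 2 pattern might look like the sketch below. The variable name TrainFlag and the 70 percent share are assumptions chosen for illustration; only the SET, COMPUTE and EXECUTE pattern is taken from Method 2.

```spss
* Sketch: reproducible training flag (TrainFlag and the 70 percent share are illustrative assumptions).
SET RNG=MT MTINDEX=2019.
COMPUTE TrainFlag = RV.BERNOULLI(0.7).
EXECUTE.
```

To draw the same flag again later in the session, the SET command would have to be repeated directly before the COMPUTE, and the COMPUTE followed by its own EXECUTE, as in Method 2.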

Idea Priority: High
Priority Justification: Customers are not aware of these pitfalls, so analysis results may be wrong.
Customer Name: Mario Giesel, Mediaplus Munich, Germany