Watson Discovery uses stemming to identify valid matches. In our experience, stemming appears to be applied to both Natural Language Queries and Discovery Query Language queries, and it also seems to be enforced in phrase searches. This is sometimes useful, but in other cases it makes relevant matches very difficult to find. For example, one of our collections has several documents containing the term "DCS" and several containing the term "DC". These terms are unrelated, and there is no reason for our users to get both from a single query. However, a search for either DCS or DC returns both, and there is no way to filter out the undesired matches because WDS seems to interpret them as the same term.
We think WDS should not apply stemming to phrase searches, since this type of search is primarily used to specify exact matches such as titles or excerpts of a document. Alternatively, we would like some control over when stemming is applied, or over which words or terms should not be stemmed.
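To illustrate the behavior we are seeing and the kind of control we are asking for, here is a minimal Python sketch. It does not reflect how Watson Discovery is actually implemented: `naive_stem`, `normalize`, `matches`, and the `protected` set are all hypothetical names, and the stemmer is a toy that only strips a trailing "s". It simply shows how a stemmer can collapse "DCS" and "DC" into one term, and how a do-not-stem list would keep them apart.

```python
def naive_stem(token: str) -> str:
    """Toy stemmer (an assumption, not WDS's algorithm): lowercase the
    token and drop a single trailing 's', as a plural-stripping rule would."""
    word = token.lower()
    if len(word) > 2 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(token: str, protected: frozenset = frozenset()) -> str:
    """Stem a token unless it is listed in the hypothetical do-not-stem set."""
    word = token.lower()
    if word in protected:
        return word  # protected terms are compared exactly as written
    return naive_stem(word)


def matches(query: str, term: str, protected: frozenset = frozenset()) -> bool:
    """Report whether a query term and a document term normalize identically."""
    return normalize(query, protected) == normalize(term, protected)


if __name__ == "__main__":
    # With stemming always on, the two unrelated terms collide.
    print(matches("DCS", "DC"))
    # With "dcs" protected, they are treated as distinct terms.
    print(matches("DCS", "DC", frozenset({"dcs"})))
    # Ordinary plurals still benefit from stemming.
    print(matches("documents", "document", frozenset({"dcs"})))
```

A per-collection list like `protected` is only one possible design. Disabling stemming whenever the query is a quoted phrase would address the same problem without requiring us to enumerate terms in advance.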
|Who would benefit from this IDEA?|As a customer, I would like to be able to run searches that retrieve only exact matches.|