IBM Data and AI

Improve API response structure for timestamps and word_confidence within STT SpeechRecognitionAlternative model

As part of the response to a POST to the v1/recognize endpoint in the Speech to Text service, the user receives an array of "alternatives". Within each "alternatives" object, there are two arrays called "word_confidence" and "timestamps". Below is an example of this piece of the response:

"alternatives": [
  {
    "transcript": "thunderstorms could produce",
    "confidence": 0.994,
    "word_confidence": [
      [
        "thunderstorms",
        1
      ],
      [
        "could",
        1
      ],
      [
        "produce",
        1
      ]
    ],
    "timestamps": [
      [
        "thunderstorms",
        1.49,
        2.32
      ],
      [
        "could",
        2.32,
        2.54
      ],
      [
        "produce",
        2.54,
        3.01
      ]
    ]
  }
]

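The problem with this response is that both "word_confidence" and "timestamps" are arrays of arrays, even though each inner array holds specific attributes at fixed indices (the word, then its confidence; or the word, then its start and end times). Data of this kind is more naturally represented with "word_confidence" and "timestamps" as arrays of objects with named attributes. For example (the attribute names shown are only suggestions):

"alternatives": [
  {
    "transcript": "thunderstorms could produce",
    "confidence": 0.994,
    "word_confidence": [
      {
        "word": "thunderstorms",
        "confidence": 1
      },
      {
        "word": "could",
        "confidence": 1
      },
      {
        "word": "produce",
        "confidence": 1
      }
    ],
    "timestamps": [
      {
        "word": "thunderstorms",
        "start_time": 1.49,
        "end_time": 2.32
      },
      {
        "word": "could",
        "start_time": 2.32,
        "end_time": 2.54
      },
      {
        "word": "produce",
        "start_time": 2.54,
        "end_time": 3.01
      }
    ]
  }
]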
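
Until the response changes, a client can do this reshaping itself. The following is a minimal Python sketch, not part of any IBM SDK. The function name to_word_objects and the keys word, confidence, start_time, and end_time are assumptions chosen for illustration. It takes one "alternatives" entry in the current format and returns the same data with named fields.

```python
from typing import Any, Dict


def to_word_objects(alternative: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of one "alternatives" entry with named fields.

    The output key names (word, confidence, start_time, end_time) are
    hypothetical; they are not defined by the Speech to Text API.
    """
    reshaped = dict(alternative)  # shallow copy; other fields pass through

    # Current format of each inner array: [word, confidence]
    reshaped["word_confidence"] = [
        {"word": word, "confidence": confidence}
        for word, confidence in alternative.get("word_confidence", [])
    ]

    # Current format of each inner array: [word, start_time, end_time]
    reshaped["timestamps"] = [
        {"word": word, "start_time": start, "end_time": end}
        for word, start, end in alternative.get("timestamps", [])
    ]
    return reshaped


example = {
    "transcript": "thunderstorms could produce",
    "confidence": 0.994,
    "word_confidence": [["thunderstorms", 1], ["could", 1], ["produce", 1]],
    "timestamps": [
        ["thunderstorms", 1.49, 2.32],
        ["could", 2.32, 2.54],
        ["produce", 2.54, 3.01],
    ],
}

converted = to_word_objects(example)
print(converted["timestamps"][0])
# {'word': 'thunderstorms', 'start_time': 1.49, 'end_time': 2.32}
```

The unpacking in each comprehension is exactly the positional knowledge that every client currently has to hard-code; with named attributes in the response, this step would not be needed.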

An added motivation for the change is that the current structure causes problems with the API specification and, in turn, with efforts to automatically generate code and documentation from the API.

 

Currently, we specify the various Watson APIs using the OpenAPI version 2.0 specification. These can be found here. Because the response is actually structured as nested arrays of mixed types, it cannot be described properly under that specification, so code generated from the Speech to Text API spec has to be tweaked by hand.
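
To make the specification problem concrete, here is a rough sketch of the two shapes written as OpenAPI 2.0 style schema fragments, expressed as Python dictionaries. The fragments, the helper typed_fields, and the property names are illustrative assumptions, not excerpts from the actual Watson API definitions. As far as I understand OpenAPI 2.0, "items" holds a single schema that applies to every element, so an inner array of a string followed by two numbers has no faithful description, while the object form is straightforward.

```python
# One possible loose description of the current "timestamps" value.
# "items" holds one schema for all elements, so the inner array cannot
# say "a string followed by two numbers"; its element type is left open.
timestamps_as_arrays = {
    "type": "array",
    "items": {
        "type": "array",
        "items": {},  # no type: elements are a mix of string and number
    },
}

# The proposed shape: each position becomes a named, typed property.
# The property names are hypothetical.
timestamps_as_objects = {
    "type": "array",
    "items": {
        "type": "object",
        "required": ["word", "start_time", "end_time"],
        "properties": {
            "word": {"type": "string"},
            "start_time": {"type": "number"},
            "end_time": {"type": "number"},
        },
    },
}


def typed_fields(schema):
    """List the (name, type) pairs a code generator could derive per element."""
    properties = schema["items"].get("properties", {})
    return [(name, spec["type"]) for name, spec in properties.items()]


print(typed_fields(timestamps_as_arrays))
# []
print(typed_fields(timestamps_as_objects))
# [('word', 'string'), ('start_time', 'number'), ('end_time', 'number')]
```

With the array form a generator has no field names or per-position types to work from, which is where the manual tweaking comes in; the object form gives it everything it needs.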

 

Overall, making the proposed change to the v1/recognize response in the Speech to Text service would both better reflect the structure of the data and make the API easier to document, and it would further the effort to streamline API changes across all relevant tools.

  • Guest
  • Jan 31 2018
  • Planned for future release
Who would benefit from this IDEA? Customers will benefit from having a more natural API response to work with. They will also benefit indirectly, because the change simplifies development: with the proposed structure, less manual work would be needed to adjust automatically generated code, allowing quicker responses to API changes in customer-facing tools.
