US8108214B2 - System and method for recognizing proper names in dialog systems - Google Patents
- Publication number
- US8108214B2 (application US12/274,267)
- Authority
- US
- United States
- Prior art keywords
- proper name
- name
- confidence score
- user
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Definitions
- Embodiments of the invention relate generally to dialog systems, and more specifically to recognizing proper names in dialog systems.
- A dialog system is a computer system designed to converse with a human using a coherent structure and text, speech, graphics, or other modes of communication on both the input and output channels. Dialog systems that employ speech are referred to as spoken dialog systems and generally represent the most natural type of man-machine interface. With the ever-greater reliance on electronic devices, spoken dialog systems are increasingly being implemented in many different machines.
- Proper names, such as names of people, locations, companies, and places, are very widely used in these applications. The number of proper names involved is often very large and may include foreign names, such as street names in a navigation domain or restaurant names in a restaurant selection domain. When used in high-stress environments, such as driving a car, flying a helicopter, or operating machinery, people also tend to use short-hand terms, such as partial proper names and slight variations of them.
- The present problems of proper name recognition in conventional spoken language interface applications include inadequate speech recognition accuracy for these names in the speech recognizer component, and inadequate accuracy in matching the recognized names against the names present in the system database.
- Present recognition systems may also be configured to confirm proper names by means of direct confirmation, in which the system responds to a question by rephrasing the user's utterance and directly restating the name or names as they were understood by the system.
- One type of direct confirmation system explicitly asks the user whether he or she mentioned a specific name or names. For example, if the user is making an airplane reservation, he might say "I want to fly from Boston to New York". The system may then respond by saying: "You said Boston to New York, is that correct?" The user must then answer that this was correct or incorrect and provide any correction necessary. In order to make the system seem more conversational, the confirmation may be restated in a less direct manner.
- FIG. 1 is a block diagram of a spoken dialog system that incorporates an improved proper name recognition unit, according to an embodiment.
- FIG. 2 is a block diagram that illustrates the components for generating an indirect confirmation statement, under an embodiment.
- FIG. 3 is a flowchart illustrating a method of generating an indirect confirmation statement, under an embodiment.
- FIG. 4 is a block diagram of the functional components of the dialog strategy component, under an embodiment.
- Embodiments of a dialog system that utilizes contextual information to perform recognition of proper names are described. Unlike present name recognition methods for large name lists, which generally focus strictly on the static aspect of the names, embodiments of the present system take into account temporal, recency, and context effects when names are used, and formulate new questions to further constrain the search space or grammar for recognition of past and current utterances.
- The confidence level for proper name recognition is usually not very high, at least for certain names.
- Systems have been developed to use certain contextual information, such as knowledge of a specific domain or a user model.
- Embodiments of the proper name recognition system build and utilize contextual information through the formulation of indirect confirmations, which may be provided in the form of questions derived from user input in previous dialog turns.
- FIG. 1 is a block diagram of a spoken dialog system that incorporates a proper name recognition unit that utilizes contextual information, according to an embodiment.
- Any of the processes executed on a processing device may also be referred to as modules or components, and may be standalone programs executed locally on a respective device computer, or they can be portions of a distributed client application run on one or more devices.
- The core components of system 100 include a spoken language understanding (SLU) module 104 with multiple understanding strategies for imperfect input, an information-state-update or other kind of dialog manager (DM) 106 that handles multiple dialog threads and mixed initiatives, a knowledge manager (KM) 110 that controls access to ontology-based domain knowledge, and a content optimizer 112 that connects the dialog manager and the knowledge manager to resolve ambiguities in users' requests, regulate the amount of information presented to the user, and provide recommendations to users.
- Spoken user input 101 produces acoustic waves that are received by a speech recognition unit 102.
- The speech recognition unit 102 can include components to provide functions such as dynamic grammars and class-based n-grams.
- A response generator 108 provides the output of the system 100.
- The response generator 108 generates audio and/or text output based on the user input. Such output can be an answer to a query, a request for clarification or further information, a reiteration of the user input, or any other appropriate response.
- The response generator 108 utilizes domain information when generating responses; different ways of saying the same thing to the user will often yield very different results.
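- The flow among these components can be pictured with a minimal sketch; the class, method, and stand-in components below are illustrative assumptions, not the patent's implementation:

```python
# Minimal sketch of the dialog pipeline of FIG. 1. All names are illustrative.
class DialogSystem:
    def __init__(self, recognizer, slu, dialog_manager, response_generator):
        self.recognizer = recognizer                  # speech recognition unit 102
        self.slu = slu                                # spoken language understanding 104
        self.dialog_manager = dialog_manager          # dialog manager 106
        self.response_generator = response_generator  # response generator 108

    def handle_turn(self, audio):
        hypotheses = self.recognizer(audio)           # acoustic input -> scored text
        semantics = self.slu(hypotheses)              # text -> semantic content
        action = self.dialog_manager(semantics)       # choose the next dialog move
        return self.response_generator(action)        # produce audio/text output

# Toy usage with stand-in components:
system = DialogSystem(
    recognizer=lambda audio: [("i want to fly from boston to new york", 0.62)],
    slu=lambda hyps: {"intent": "book_flight", "from": "Boston", "to": "New York"},
    dialog_manager=lambda sem: {"act": "indirect_confirm", "slot": "from"},
    response_generator=lambda act: "OK, when would you like to leave Massachusetts?",
)
print(system.handle_turn(audio=b"\x00\x01"))
```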
- System 100 illustrated in FIG. 1 includes a large data store 118 that stores a large number of names.
- The term "name" is used here to denote any type of entity label, such as the name of a person, place, or thing, or any other descriptor or label for an object or entity.
- The number of names in data store 118 could be very large, e.g., on the order of tens to hundreds of thousands of names.
- The large name list can be pared down into a smaller list of names with weight values attached, based on the context of the names used in the input speech of recent conversations. Names outside of the smaller list are assigned a weight value of zero.
- Data store 118 can hold names organized into one or more databases.
- One database can be a static database that contains all possible names, commonly used names (such as common trademarks or references), or names frequently used by the user (such as names derived from a user profile or model). In a static database, the weight values are precomputed before a conversation is started and are typically based on frequency of usage.
- A second database may be a dynamic database that continually takes names in the context of the utterance (such as names just mentioned) from the DM unit 106.
- A name list can thus be built that contains full and partial names, with proper weighting values attached depending on the context in which the names are used and on other characteristics of the names.
- Each name in the name list or lists is assigned a weight depending upon the database from which it was derived.
- Names from the dynamic database are weighted higher than names from the static database. Weights can be assigned on any appropriate scale, such as 0 to 100%, and are used to help the recognition system improve recognition accuracy. A minimal sketch of such a two-source weighting appears below.
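- The particular weighting scheme in the following sketch (normalized static frequencies, a fixed top weight for context names) is an illustrative assumption:

```python
def build_name_list(static_freqs, context_names, dynamic_weight=1.0):
    """Pare a large static name list down to a weighted recognition list.

    static_freqs:  {name: usage frequency}, precomputed before the dialog starts.
    context_names: names recently mentioned in the conversation (from DM unit 106).
    Names appearing in neither source implicitly carry a weight of zero.
    """
    total = float(sum(static_freqs.values())) or 1.0
    # Static database: weights derived from frequency of usage.
    weights = {name: freq / total for name, freq in static_freqs.items()}
    # Dynamic database: names from the dialog context outrank any static weight.
    for name in context_names:
        weights[name] = dynamic_weight
    return weights

weights = build_name_list(
    {"Boston": 900, "Austin": 400, "Houston": 700},
    context_names={"Boston"},
)
print(weights)  # Boston gets the top weight because it was just mentioned
```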
- The embodiment of system 100 also includes a dialog strategy component 114.
- The dialog strategy component is invoked when the dialog manager 106 detects that a name has been recognized with a relatively low degree of confidence. For names recognized with a sufficiently high level of confidence, dialogs are processed through the standard response process defined by the system.
- The dialog strategy component 114 implements a name recognition system that includes an indirect confirmation method. Unlike direct confirmation, in which the names uttered by the user are directly restated by the system (e.g., "You said Boston to New York, correct?"), an indirect confirmation system generates new questions for the user that are based on the names but do not restate them. This type of system reduces the repetitiveness of direct confirmation, is more conversational, and adds potentially relevant data to the user model. For example, if the user says "I want to fly from Boston to New York", the system may respond by saying "OK, when would you like to leave Massachusetts?" This type of indirect confirmation requires the formulation of a related question based on the properly recognized proper names in the user utterance.
- Had the system misrecognized "Boston" as, say, "Austin", the indirect confirmation would have been stated as "OK, when would you like to leave Texas?" In this case, the user would need to correct the system by restating the question or clarifying the stated names.
- The indirect confirmation system eliminates a potential problem of direct confirmation systems: the user failing to notice that the repeated name was incorrect. That is, if the system stated "Austin" instead of "Boston", the user may hear "Boston" instead of "Austin", as he originally anticipated, and not realize that the system made a mistake. By formulating a different statement, the system more fully engages the user and provides a different basis for understanding and clarification.
- The related question can also be formulated based on other types of information available to the system, such as user location, device type, and any other objective information. For example, if the user is in a car driving through Northern California and requests that the system find a restaurant in Mountain View, the system may confuse this place name with Monterey. In this case, the system could state back to the user: "As you drive through Silicon Valley . . . "
- This indirect confirmation utilizes the facts that the location of the user placed him in the vicinity of Silicon Valley rather than the Monterey peninsula, and that the user was in an automobile at the time of the request. If the system's understanding was correct, the user can simply continue the dialog with the system; otherwise he or she can provide correction information. Additional indirect confirmation questions or statements can be provided based on the user response to the system output. The system uses the confidence levels from the speech recognition stage to generate such responses until a sufficient level of recognition accuracy is attained.
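- As a sketch, this formulation step can be reduced to a lookup from the recognized name to a related attribute that anchors the question; the table and function below are hypothetical stand-ins, not the patent's implementation:

```python
# Hypothetical knowledge table relating a city name to an attribute that can
# anchor an indirect confirmation without restating the name itself.
CITY_TO_REGION = {
    "Boston": "Massachusetts",
    "Austin": "Texas",
    "Houston": "Texas",
    "Mountain View": "Silicon Valley",
    "Monterey": "the Monterey peninsula",
}

def indirect_confirmation(city, user_region=None):
    """Formulate a related question or statement that implies the name.

    A smooth continuation from the user indirectly confirms the hypothesis;
    a correction signals a misrecognition. Objective data such as the user's
    sensed location can be folded into the wording.
    """
    region = CITY_TO_REGION.get(city)
    if region is None:
        # Fall back to direct confirmation when no related attribute is known.
        return f"You said {city}, is that correct?"
    if user_region == region:
        return f"As you drive through {region}..."
    return f"OK, when would you like to leave {region}?"

print(indirect_confirmation("Boston"))
print(indirect_confirmation("Mountain View", user_region="Silicon Valley"))
```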
- FIG. 2 is a block diagram that illustrates the components for generating an indirect confirmation statement, under an embodiment.
- The dialog strategy component takes data from both the user input 202 and objective data sources 204 to generate the indirect confirmation statement or question 210.
- The objective data 204 can be provided from various sources, such as user profile databases, location sensors, device descriptors, and so on.
- The dialog strategy component 114 keeps track of past user utterances, along with the semantic content and data obtained from them, in order to recognize the current utterance during the interaction. Confidence levels are used to measure the accuracy of the recognition, and one or more threshold confidence levels may be defined to implement the process. Specifically, if the confidence score of the current recognized utterance is high, the recognized utterance, its semantic content, and the data retrieved from it are used for continuing the interaction with the user. If the confidence score of the recognized utterance or the semantic content is below a certain defined threshold, a related indirect confirmation question or statement is generated and provided to the user as part of the dialog process.
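- In code, this gate reduces to a single comparison; the threshold value shown is an illustrative assumption within the typical empirical range:

```python
CONFIDENCE_THRESHOLD = 0.80  # typically set empirically, roughly 0.75-0.85

def next_dialog_move(hypothesis, confidence):
    """Decide whether to continue the dialog or indirectly confirm the name."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return ("continue", hypothesis)       # use utterance, semantics, data as-is
    return ("indirect_confirm", hypothesis)   # formulate a related question instead

print(next_dialog_move("Boston", 0.91))  # ('continue', 'Boston')
print(next_dialog_move("Boston", 0.55))  # ('indirect_confirm', 'Boston')
```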
- FIG. 3 is a flowchart illustrating a method of generating an indirect confirmation statement, under an embodiment.
- The speech recognizer component receives the user utterance, and the system parses the proper name or names in the utterance. The system attempts to recognize each proper name and determines an initial confidence score for the recognition.
- A threshold confidence level is then set. In one embodiment, the threshold is set empirically based on the speech recognizer. The confidence level can be provided automatically by the recognizer unit 102 (such as in the case of a commercially available unit), or it can be defined by a system administrator or designer. Confidence levels are typically specified in a percentage range of 0 to 100%, and a typical threshold value may be around 75-85%.
- If a recognizer returns a hypothesis that has a confidence level greater than the threshold, the system will accept the hypothesis as an accurately recognized name. Any value less than the threshold will result in the hypothesis being rejected.
- Different recognizers may have different threshold levels depending on application requirements and system constraints.
- The speech recognizer unit 102 may generate one or more hypotheses for a recognized name. For example, for the flight booking question above, the speech recognizer may produce the following three recognition hypotheses: Boston, Austin, and Houston. Of these three, or of any number of hypotheses, one might be selected as better than the others based on the confidence score or other data. For example, the system may know that the user is on the east coast of the United States at the time of the utterance; in this case, Boston is a better choice than either Austin or Houston, even if one of those city names has a higher confidence score. In block 305, the system selects the best hypothesis out of the number available. This choice can be made on the basis of confidence score and/or any external information available to the system, and can be dictated by system- and/or user-defined rules, as in the sketch below.
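- A sketch of block 305 follows; the regional lookup table and bonus value are illustrative assumptions standing in for whatever external information and rules a deployment provides:

```python
EAST_COAST = {"Boston", "New York", "Philadelphia"}  # hypothetical lookup table

def select_best_hypothesis(hypotheses, user_region=None, region_bonus=0.2):
    """Pick one hypothesis from the n-best list (block 305).

    hypotheses: list of (name, confidence) pairs from the recognizer.
    External knowledge such as the user's coarse location can outweigh a
    raw confidence edge between competing names.
    """
    def score(item):
        name, confidence = item
        if user_region == "east coast" and name in EAST_COAST:
            return confidence + region_bonus
        return confidence
    return max(hypotheses, key=score)

best = select_best_hypothesis(
    [("Boston", 0.55), ("Austin", 0.60), ("Houston", 0.50)],
    user_region="east coast",
)
print(best)  # ('Boston', 0.55) wins despite Austin's higher raw confidence
```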
- The confidence score of the selected hypothesis is then compared to the defined confidence threshold, block 306. If the confidence score of the recognized utterance or the semantic content is low, a related question, formulated based on contextual information, is prompted to the user by the system, block 308. The user response to this related question is then received and processed, block 310. This response is then used to constrain the re-recognition or re-scoring of the previous low-confidence user utterances and of information obtained in the past interaction, block 312. The process repeats from block 306, in which the threshold comparison is performed, until a sufficiently confident result, or a sufficiently high confidence of combined results, is obtained from the user. Once the recognized result and the information obtained from the answer utterance have a high enough confidence level, that is, one greater than the defined threshold, the proper name is accepted as recognized, and the dialog system continues with a normal system response.
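- Blocks 306-312 thus form a loop along the following lines (a schematic sketch; the recognition, question-formulation, prompting, and re-scoring internals are stand-ins passed in as functions):

```python
def recognize_proper_name(utterance, recognize, formulate_question, ask_user,
                          rescore, threshold=0.80, max_turns=3):
    """Schematic rendering of the loop in FIG. 3, blocks 306-312."""
    name, confidence = recognize(utterance)
    for _ in range(max_turns):
        if confidence >= threshold:          # block 306: accept and move on
            return name
        question = formulate_question(name)  # block 308: related question
        answer = ask_user(question)          # block 310: user response
        # Block 312: constrain re-recognition / re-scoring with the answer.
        name, confidence = rescore(name, confidence, answer)
    return name  # best available hypothesis once the turn budget is spent
```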
- A related question is thus formed whenever the confidence level of the selected hypothesis is below the defined confidence threshold.
- The related question can be formulated in different ways.
- The question may be formulated based on the n-best list or lattice produced by the system for the current user utterance, on the knowledge base, or on relations in a database for the application.
- The n-best list is generated by the speech recognizer, which takes the input acoustic signal and produces one or more recognition hypotheses; a lattice is a compressed representation of the n-best list.
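- As a toy illustration of the two representations (actual recognizer structures are considerably richer):

```python
# An n-best list: alternative transcriptions, each with a confidence score.
n_best = [
    ("i want to fly from boston to new york", 0.55),
    ("i want to fly from austin to new york", 0.30),
    ("i want to fly from houston to new york", 0.15),
]

# A lattice compresses the shared words into one path with a branch point,
# so the three hypotheses differ only at the city slot.
lattice = {
    "prefix": "i want to fly from",
    "branches": {"boston": 0.55, "austin": 0.30, "houston": 0.15},
    "suffix": "to new york",
}
```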
- If the recognized result of the answer has a high confidence, it can be used to constrain the re-recognition or re-scoring of the previous user utterance.
- The name candidates are then refined based on information collected from the user's answer. If more than one hypothesis was available for selection, this iterative process of posing related indirect confirmation questions and refining the confidence scoring helps the system select among the different possible hypotheses. For example, if the hypotheses comprise Boston, Austin, and Houston, a positive user response to the related question "So, you plan to fly out of Massachusetts?" will result in the system selecting Boston as the recognized name.
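- A minimal sketch of this refinement step, using a hypothetical city-to-region table:

```python
def refine_hypotheses(hypotheses, confirmed_region, city_to_region):
    """Keep only name candidates consistent with the user's answer to the
    related question (e.g., "yes" to flying out of Massachusetts)."""
    return [(name, score) for name, score in hypotheses
            if city_to_region.get(name) == confirmed_region]

remaining = refine_hypotheses(
    [("Boston", 0.55), ("Austin", 0.30), ("Houston", 0.15)],
    confirmed_region="Massachusetts",
    city_to_region={"Boston": "Massachusetts", "Austin": "Texas", "Houston": "Texas"},
)
print(remaining)  # [('Boston', 0.55)] - the only candidate consistent with the answer
```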
- A highly confident answer can also be used to re-score the previously recognized result and the data retrieved from the user utterance. For instance, if there is an overlap between the user utterances, or between the data obtained from those utterances, the confidence for the overlapping part is combined by a predefined model or function, e.g., a certain weighted aggregating function. Multiple steps can be performed until a highly confident result, or a high confidence of combined results, is obtained from the user. In this context, overlaps may comprise repeated words between the system response and the user utterances.
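- The aggregation model itself is left open; a noisy-OR combination is one simple illustrative assumption, under which each corroborating observation of the same content raises the combined score:

```python
def combine_confidence(previous, new):
    """Aggregate confidence for content that overlaps across dialog turns.

    Noisy-OR form: treating the two observations as independent evidence,
    agreement between turns pushes the combined score above either input.
    """
    return 1.0 - (1.0 - previous) * (1.0 - new)

print(combine_confidence(0.6, 0.7))  # 0.88: corroboration lifts both scores
```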
- FIG. 4 is a block diagram of the functional components of the dialog strategy component, under an embodiment.
- The dialog strategy component includes a question formulation module 404 that formulates the related questions, a decision making component 406, and a re-scoring/re-recognition component 408.
- The related questions impact the language model portion of the speech recognizer.
- The language model constrains the search.
- The change to the model will produce different results for following questions, which introduces a degree of dynamic adaptiveness to the system.
- The dialog strategy component thus uses contextual information to constrain and refine the name candidates for speech recognition. Anchoring on the confident portions of the utterance with clarification dialogs can exploit the semantic relations internal to the data to narrow down the types of names for recognition.
- Aspects of the name recognition process described herein may be implemented as functionality programmed into any of a variety of circuitry, including programmable logic devices (PLDs), such as field programmable gate arrays (FPGAs), programmable array logic (PAL) devices, electrically programmable logic and memory devices, standard cell-based devices, and application specific integrated circuits.
- Some other possibilities for implementing aspects include: microcontrollers with memory (such as EEPROM), embedded microprocessors, firmware, software, etc.
- Aspects of the content serving method may be embodied in microprocessors having software-based circuit emulation, discrete logic (sequential and combinatorial), custom devices, fuzzy (neural) logic, quantum devices, and hybrids of any of the above device types. The underlying device technologies may be provided in a variety of component types, e.g., metal-oxide semiconductor field-effect transistor (MOSFET) technologies such as complementary metal-oxide semiconductor (CMOS), bipolar technologies such as emitter-coupled logic (ECL), polymer technologies (e.g., silicon-conjugated polymer and metal-conjugated polymer-metal structures), mixed analog and digital, and so on.
- Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, non-volatile storage media in various forms (e.g., optical, magnetic or semiconductor storage media) and carrier waves that may be used to transfer such formatted data and/or instructions through wireless, optical, or wired signaling media or any combination thereof.
- Examples of transfers of such formatted data and/or instructions by carrier waves include, but are not limited to, transfers (uploads, downloads, e-mail, etc.) over the Internet and/or other computer networks via one or more data transfer protocols (e.g., HTTP, FTP, SMTP, and so on).
- The words "comprise," "comprising," and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of "including, but not limited to." Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words "herein," "hereunder," "above," "below," and words of similar import refer to this application as a whole and not to any particular portion of this application. When the word "or" is used in reference to a list of two or more items, that word covers all of the following interpretations: any of the items in the list, all of the items in the list, and any combination of the items in the list.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/274,267 US8108214B2 (en) | 2008-11-19 | 2008-11-19 | System and method for recognizing proper names in dialog systems |
CN200980154741.2A CN102282609B (en) | 2008-11-19 | 2009-11-13 | System and method for recognizing proper names in dialog systems |
EP09793636.3A EP2359364B1 (en) | 2008-11-19 | 2009-11-13 | System and method for recognizing proper names in dialog systems |
PCT/US2009/064414 WO2010059525A1 (en) | 2008-11-19 | 2009-11-13 | System and method for recognizing proper names in dialog systems |
US13/339,086 US20120101823A1 (en) | 2008-11-19 | 2011-12-28 | System and method for recognizing proper names in dialog systems |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/274,267 US8108214B2 (en) | 2008-11-19 | 2008-11-19 | System and method for recognizing proper names in dialog systems |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/339,086 Continuation US20120101823A1 (en) | 2008-11-19 | 2011-12-28 | System and method for recognizing proper names in dialog systems |
Publications (2)
Publication Number | Publication Date |
---|---|
US20100125456A1 US20100125456A1 (en) | 2010-05-20 |
US8108214B2 true US8108214B2 (en) | 2012-01-31 |
Family
ID=41557545
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/274,267 Active 2030-10-10 US8108214B2 (en) | 2008-11-19 | 2008-11-19 | System and method for recognizing proper names in dialog systems |
US13/339,086 Abandoned US20120101823A1 (en) | 2008-11-19 | 2011-12-28 | System and method for recognizing proper names in dialog systems |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/339,086 Abandoned US20120101823A1 (en) | 2008-11-19 | 2011-12-28 | System and method for recognizing proper names in dialog systems |
Country Status (4)
Country | Link |
---|---|
US (2) | US8108214B2 (en) |
EP (1) | EP2359364B1 (en) |
CN (1) | CN102282609B (en) |
WO (1) | WO2010059525A1 (en) |
Families Citing this family (184)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
JP2011253374A (en) * | 2010-06-02 | 2011-12-15 | Sony Corp | Information processing device, information processing method and program |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US20130317805A1 (en) * | 2012-05-24 | 2013-11-28 | Google Inc. | Systems and methods for detecting real names in different languages |
US9679568B1 (en) | 2012-06-01 | 2017-06-13 | Google Inc. | Training a dialog system using user feedback |
US9123338B1 (en) | 2012-06-01 | 2015-09-01 | Google Inc. | Background audio identification for speech disambiguation |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
CN103514165A (en) * | 2012-06-15 | 2014-01-15 | 佳能株式会社 | Method and device for identifying persons mentioned in conversation |
KR102081925B1 (en) * | 2012-08-29 | 2020-02-26 | 엘지전자 주식회사 | display device and speech search method thereof |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
CN110889265B (en) | 2012-12-28 | 2024-01-30 | 索尼公司 | Information processing apparatus and information processing method |
CN104969289B (en) | 2013-02-07 | 2021-05-28 | 苹果公司 | Voice trigger of digital assistant |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US20170293610A1 (en) * | 2013-03-15 | 2017-10-12 | Bao Tran | Voice assistant |
CN105027197B (en) * | 2013-03-15 | 2018-12-14 | 苹果公司 | Training at least partly voice command system |
US9818401B2 (en) | 2013-05-30 | 2017-11-14 | Promptu Systems Corporation | Systems and methods for adaptive proper name entity recognition and understanding |
WO2014194299A1 (en) * | 2013-05-30 | 2014-12-04 | Promptu Systems Corporation | Systems and methods for adaptive proper name entity recognition and understanding |
US10170114B2 (en) | 2013-05-30 | 2019-01-01 | Promptu Systems Corporation | Systems and methods for adaptive proper name entity recognition and understanding |
US9449599B2 (en) | 2013-05-30 | 2016-09-20 | Promptu Systems Corporation | Systems and methods for adaptive proper name entity recognition and understanding |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
CN110442699A (en) | 2013-06-09 | 2019-11-12 | 苹果公司 | Operate method, computer-readable medium, electronic equipment and the system of digital assistants |
JP6163266B2 (en) | 2013-08-06 | 2017-07-12 | アップル インコーポレイテッド | Automatic activation of smart responses based on activation from remote devices |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
CN103677729B (en) * | 2013-12-18 | 2017-02-08 | 北京搜狗科技发展有限公司 | Voice input method and system |
US10043185B2 (en) | 2014-05-29 | 2018-08-07 | Apple Inc. | User interface for payments |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
WO2015184186A1 (en) | 2014-05-30 | 2015-12-03 | Apple Inc. | Multi-command single utterance input method |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9786276B2 (en) * | 2014-08-25 | 2017-10-10 | Honeywell International Inc. | Speech enabled management system |
WO2016036552A1 (en) | 2014-09-02 | 2016-03-10 | Apple Inc. | User interactions for a mapping application |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9574896B2 (en) | 2015-02-13 | 2017-02-21 | Apple Inc. | Navigation user interface |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10200824B2 (en) | 2015-05-27 | 2019-02-05 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on a touch-sensitive device |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US9940637B2 (en) | 2015-06-05 | 2018-04-10 | Apple Inc. | User interface for loyalty accounts and private label accounts |
US20160358133A1 (en) | 2015-06-05 | 2016-12-08 | Apple Inc. | User interface for loyalty accounts and private label accounts for a wearable device |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
CN105161097A (en) * | 2015-07-23 | 2015-12-16 | 百度在线网络技术(北京)有限公司 | Voice interaction method and apparatus |
US10740384B2 (en) | 2015-09-08 | 2020-08-11 | Apple Inc. | Intelligent automated assistant for media search and playback |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10331312B2 (en) | 2015-09-08 | 2019-06-25 | Apple Inc. | Intelligent automated assistant in a media environment |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10956666B2 (en) | 2015-11-09 | 2021-03-23 | Apple Inc. | Unconventional virtual assistant interactions |
KR102450853B1 (en) | 2015-11-30 | 2022-10-04 | 삼성전자주식회사 | Apparatus and method for speech recognition |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
JP6696803B2 (en) * | 2016-03-15 | 2020-05-20 | 本田技研工業株式会社 | Audio processing device and audio processing method |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | Intelligent automated assistant in a home environment |
US12223282B2 (en) | 2016-06-09 | 2025-02-11 | Apple Inc. | Intelligent automated assistant in a home environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10621581B2 (en) | 2016-06-11 | 2020-04-14 | Apple Inc. | User interface for transactions |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
CN109313759B (en) | 2016-06-11 | 2022-04-26 | 苹果公司 | User interface for transactions |
EP3491541A4 (en) | 2016-07-29 | 2020-02-26 | Microsoft Technology Licensing, LLC | Conversation oriented machine-user interaction |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
AU2017326987B2 (en) * | 2016-09-19 | 2022-08-04 | Promptu Systems Corporation | Systems and methods for adaptive proper name entity recognition and understanding |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | User interface for correcting recognition errors |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK180048B1 (en) | 2017-05-11 | 2020-02-04 | Apple Inc. | MAINTAINING THE DATA PROTECTION OF PERSONAL INFORMATION |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770428A1 (en) | 2017-05-12 | 2019-02-18 | Apple Inc. | Low-latency intelligent automated assistant |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770411A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | MULTI-MODAL INTERFACES |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | Far-field extension for digital assistant services |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US11676220B2 (en) | 2018-04-20 | 2023-06-13 | Meta Platforms, Inc. | Processing multimodal user input for assistant systems |
US11886473B2 (en) | 2018-04-20 | 2024-01-30 | Meta Platforms, Inc. | Intent identification for agent matching by assistant systems |
US11715042B1 (en) | 2018-04-20 | 2023-08-01 | Meta Platforms Technologies, Llc | Interpretability of deep reinforcement learning models in assistant systems |
US11307880B2 (en) | 2018-04-20 | 2022-04-19 | Meta Platforms, Inc. | Assisting users with personalized and contextual communication content |
US10963273B2 (en) | 2018-04-20 | 2021-03-30 | Facebook, Inc. | Generating personalized content summaries for users |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK179822B1 (en) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | Virtual assistant operation in multi-device environments |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10944859B2 (en) | 2018-06-03 | 2021-03-09 | Apple Inc. | Accelerated task performance |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
CN109901810A (en) * | 2019-02-01 | 2019-06-18 | 广州三星通信技术研究有限公司 | A human-computer interaction method and device for intelligent terminal equipment |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11170170B2 (en) | 2019-05-28 | 2021-11-09 | Fresh Consulting, Inc | System and method for phonetic hashing and named entity linking from output of speech recognition |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
DK201970510A1 (en) | 2019-05-31 | 2021-02-11 | Apple Inc | Voice identification in digital assistant systems |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11468890B2 (en) | 2019-06-01 | 2022-10-11 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
US11061543B1 (en) | 2020-05-11 | 2021-07-13 | Apple Inc. | Providing relevant data items based on context |
US11038934B1 (en) | 2020-05-11 | 2021-06-15 | Apple Inc. | Digital assistant hardware abstraction |
US11755276B2 (en) | 2020-05-12 | 2023-09-12 | Apple Inc. | Reducing description length based on confidence |
US11490204B2 (en) | 2020-07-20 | 2022-11-01 | Apple Inc. | Multi-device audio adjustment coordination |
US11438683B2 (en) | 2020-07-21 | 2022-09-06 | Apple Inc. | User identification using headphones |
US20220383870A1 (en) * | 2021-05-28 | 2022-12-01 | Otis Elevator Company | Usage of voice recognition confidence levels in a passenger interface |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7574356B2 (en) * | 2004-07-19 | 2009-08-11 | At&T Intellectual Property Ii, L.P. | System and method for spelling recognition using speech and non-speech input |
JP4671898B2 (en) * | 2006-03-30 | 2011-04-20 | 富士通株式会社 | Speech recognition apparatus, speech recognition method, speech recognition program |
US7991615B2 (en) * | 2007-12-07 | 2011-08-02 | Microsoft Corporation | Grapheme-to-phoneme conversion using acoustic data |
- 2008
  - 2008-11-19: US US12/274,267 patent/US8108214B2/en active Active
- 2009
  - 2009-11-13: CN CN200980154741.2A patent/CN102282609B/en active Active
  - 2009-11-13: WO PCT/US2009/064414 patent/WO2010059525A1/en active Application Filing
  - 2009-11-13: EP EP09793636.3A patent/EP2359364B1/en active Active
- 2011
  - 2011-12-28: US US13/339,086 patent/US20120101823A1/en not_active Abandoned
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6421672B1 (en) | 1999-07-27 | 2002-07-16 | Verizon Services Corp. | Apparatus for and method of disambiguation of directory listing searches utilizing multiple selectable secondary search keys |
US20030233230A1 (en) * | 2002-06-12 | 2003-12-18 | Lucent Technologies Inc. | System and method for representing and resolving ambiguity in spoken dialogue systems |
US20050049860A1 (en) | 2003-08-29 | 2005-03-03 | Junqua Jean-Claude | Method and apparatus for improved speech recognition with supplementary information |
WO2006111230A1 (en) | 2005-04-19 | 2006-10-26 | Daimlerchrysler Ag | Method for the targeted determination of a complete input data set in a voice dialogue system |
US20060247913A1 (en) * | 2005-04-29 | 2006-11-02 | International Business Machines Corporation | Method, apparatus, and computer program product for one-step correction of voice interaction |
US20100179805A1 (en) * | 2005-04-29 | 2010-07-15 | Nuance Communications, Inc. | Method, apparatus, and computer program product for one-step correction of voice interaction |
US20080010058A1 (en) | 2006-07-07 | 2008-01-10 | Robert Bosch Corporation | Method and apparatus for recognizing large list of proper names in spoken dialog systems |
Non-Patent Citations (5)
Title |
---|
Form PCT/ISA/210, "PCT International Search Report," 3 pgs. |
Form PCT/ISA/220, "PCT Notification of Transmittal of the International Search Report and the Written Opinion of the International Searching Authority, or the Declaration," 1 pg. |
Form PCT/ISA/220, "PCT Notification of Transmittal of the International Search Report and the Written Opinion of the International Searching Authority, or the Declaration," 3 pgs. |
Form PCT/ISA/237, "PCT Written Opinion of the International Searching Authority," 5 pgs. |
Form PCT/ISA/237, "PCT Written Opinion of the International Searching Authority," 6 pgs. |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110246195A1 (en) * | 2010-03-30 | 2011-10-06 | Nvoq Incorporated | Hierarchical quick note to allow dictated code phrases to be transcribed to standard clauses |
US8831940B2 (en) * | 2010-03-30 | 2014-09-09 | Nvoq Incorporated | Hierarchical quick note to allow dictated code phrases to be transcribed to standard clauses |
Also Published As
Publication number | Publication date |
---|---|
CN102282609A (en) | 2011-12-14 |
WO2010059525A1 (en) | 2010-05-27 |
CN102282609B (en) | 2015-05-20 |
EP2359364B1 (en) | 2018-01-10 |
US20120101823A1 (en) | 2012-04-26 |
EP2359364A1 (en) | 2011-08-24 |
US20100125456A1 (en) | 2010-05-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8108214B2 (en) | System and method for recognizing proper names in dialog systems | |
US7925507B2 (en) | Method and apparatus for recognizing large list of proper names in spoken dialog systems | |
US7228275B1 (en) | Speech recognition system having multiple speech recognizers | |
US7027987B1 (en) | Voice interface for a search engine | |
US8612212B2 (en) | Method and system for automatically detecting morphemes in a task classification system using lattices | |
US7139698B1 (en) | System and method for generating morphemes | |
EP2003572B1 (en) | Language understanding device | |
US8380505B2 (en) | System for recognizing speech for searching a database | |
Souvignier et al. | The thoughtful elephant: Strategies for spoken dialog systems | |
US7620548B2 (en) | Method and system for automatic detecting morphemes in a task classification system using lattices | |
US20090055176A1 (en) | Method and System of Optimal Selection Strategy for Statistical Classifications | |
US10152298B1 (en) | Confidence estimation based on frequency | |
EP2028645A1 (en) | Method and system of optimal selection strategy for statistical classifications in dialog systems | |
EP4285358B1 (en) | Instantaneous learning in text-to-speech during dialog | |
US11289075B1 (en) | Routing of natural language inputs to speech processing applications | |
JP4680714B2 (en) | Speech recognition apparatus and speech recognition method | |
US20050187767A1 (en) | Dynamic N-best algorithm to reduce speech recognition errors | |
US7085720B1 (en) | Method for task classification using morphemes | |
US11380308B1 (en) | Natural language processing | |
van den Bosch et al. | Detecting problematic turns in human-machine interactions: Rule-induction versus memory-based learning approaches | |
WO2023148772A1 (en) | A system and method to reduce ambiguity in natural language understanding by user expectation handling | |
WO2024123507A1 (en) | Voice history-based speech biasing | |
US20020087307A1 (en) | Computer-implemented progressive noise scanning method and system | |
JP2006189730A (en) | Speech interactive method and speech interactive device | |
EP4487320A1 (en) | Comparison scoring for hypothesis ranking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ROBERT BOSCH GMBH, GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WENG, FULIANG;SHEN, ZHONGNAN;FENG, ZHE;SIGNING DATES FROM 20090128 TO 20090203;REEL/FRAME:022207/0616 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
REMI | Maintenance fee reminder mailed | ||
FPAY | Fee payment |
Year of fee payment: 4 |
|
SULP | Surcharge for late payment | ||
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |