US9600587B2 - Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results - Google Patents
- Publication number
- US9600587B2 (application US14/930,491)
- Authority
- US
- United States
- Prior art keywords
- search
- search expression
- content item
- content items
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G06F17/30867—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2423—Interactive query statement specification based on a database schema
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation
-
- G06F17/30392—
-
- G06F17/30654—
Definitions
- Embodiments of the present invention relate to the field of data processing, in particular, to methods and apparatuses for generating search expressions from content items, for applying search expressions to content collections to generate search results, and for analyzing the search results.
- Data mining technology may support knowledge discovery in databases.
- Information retrieval technology, such as latent semantic indexing and concept indexing, may support knowledge discovery in collections of electronic documents.
- Search-by-example systems may apply searches that correspond to content instances, but these systems generally are insensitive to content interrelationships.
- Predictive analytic technology may support predictions of the effects of content publication.
- Natural language understanding technology, including relation extraction technology, may support deductive and inductive reasoning based on the contents of electronic texts.
- FIG. 1 illustrates an overview of the search expression generation, search, and analysis methods and apparatuses of the current disclosure, in accordance with various embodiments.
- FIG. 2 illustrates an example non-associative juxtaposition operator reflecting contrasting content configurations, in accordance with various embodiments.
- FIG. 3 illustrates an example computing environment suitable for practicing embodiments of the present disclosure.
- Embodiments of the present invention may include, but are not limited to, generation of search expressions from pre-publication content items, automated discovery of knowledge in collections of published content items, and/or prediction of publication effectiveness of the pre-publication content items.
- Embodiments of the present invention may include a search expression generator (e.g., generator 102 of FIG. 1 ) configured to generate search expressions from content items, e.g., pre-publication content items.
- Content items may include texts, and the search expressions may include juxtaposition operators.
- Construction of search expressions may incorporate parenthesization to indicate the order of application of instances of the juxtaposition operator. Proximity within content items is one realization of juxtaposition, and is an indicator of semantic relatedness.
- the search expressions may be constructed with parentheses and the ## proximity operator. The ## proximity operator may specify a proximity relationship between two sub-expressions. ## may be scalar valued.
- a “recursively parenthesized search expression” is a search expression that includes at least one instance of parenthesization, where some sub-expression lies outside the scope of that instance of parenthesization.
- Parenthesization, as used herein, is used in an abstract sense, independently of any particular formalism. For example, parenthesization may be represented using Polish notation. In a recursively parenthesized search expression, instances of a juxtaposition operator may be explicit or implicit.
- a “search expression with nested juxtaposition” is a recursively parenthesized search expression where a juxtaposition operator applies to the constituents of the root search expression and to the constituents of each non-terminal sub-expression.
- the juxtaposition operator may or may not be of fixed arity.
- a search sub-expression may include one or more search expression terminals.
- Search expression terminals may be words (such as “dog”), or phrases (such as “friendly dog”), or word classes that contain words related through regular morphological patterns (such as {“dog”, “dogs”}), or classes of words related through morphological similarity more generally (such as {“dog”, “dogs”, “doggish”, . . . }), or classes of words related through synonymy (such as {“dog”, “pooch”, . . . }) and/or morphological similarity, or classes of words annotated with parts of speech (such as ({“dog”, . . .
- search expression terminals may be variants of one or more of the possibilities enumerated earlier.
- Various prior art methods may be employed to assign words to classes.
- Various prior art methods may be employed to detect instances of syntactic patterns or other word patterns in texts.
- Embodiments of the present disclosure may include a search engine (e.g., engine 110 of FIG. 1) configured to apply the search expressions to a collection or collections of other content items, e.g., prior published content items, to enable identification of content items within the collections that are relevant to the content items from which the search expressions were generated.
- Various embodiments may accommodate words or word sequences in texts as imperfect matches to search expression terminals. For example, “pooch” might be considered an imperfect match for “dog,” or “the cat saw the friendly dog” might be considered an imperfect match for “‘cat’ to the left of ‘dog’ with no more than one intervening word.”
- Various methods may be employed to assign scores to imperfect matches of search expression terminals.
- Various embodiments may incorporate variants of the methods of U.S. Pat. No. 7,987,169, Paragraphs 38-39, e.g., as follows: first, imperfect match scores may be normalized as positive real numbers less than 1; then, in place of the formula $\sum_{1 \le i \le k} 1/(1+d_i)^x$ in Paragraph 38 (yielding the r-value of a word W in text S, where k is the number of perfect or imperfect matches for a given search expression terminal E in a given text, where x (the “distance attenuation exponent”) is a positive real number, and where $d_i$ is the distance between W and the i-th occurrence of E), the formula $\sum_{1 \le i \le k} \sigma_i/(1+d_i)^x$ is used instead, where $0 < \sigma_i \le 1$ is the score assigned to the i-th perfect or imperfect match for E.
- Various embodiments may similarly adjust final search scores according to scores assigned to imperfect matches to search expression terminals.
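- The score-weighted sum described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of U.S. Pat. No. 7,987,169 itself: the function name `attenuated_match_sum`, the representation of matches as (distance, score) pairs, and the convention that a perfect match has a score of exactly 1.0 are all hypothetical.

```python
from typing import Iterable, Tuple


def attenuated_match_sum(matches: Iterable[Tuple[float, float]], x: float) -> float:
    """Return the sum of score / (1 + distance) ** x over all matches.

    matches: hypothetical (distance, score) pairs, one per perfect or
             imperfect match; a score of 1.0 is assumed for a perfect match.
    x:       the distance attenuation exponent (a positive real number).
    """
    if x <= 0:
        raise ValueError("the distance attenuation exponent must be positive")
    total = 0.0
    for distance, score in matches:
        if distance < 0 or not (0.0 < score <= 1.0):
            raise ValueError("expected distance >= 0 and 0 < score <= 1")
        # Each match contributes its score, attenuated by its distance.
        total += score / (1.0 + distance) ** x
    return total
```

- When every score is 1.0, this sketch reduces to the unweighted sum, so perfect matches are treated exactly as they would be without imperfect-match scoring.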
- Embodiments of the method of Paragraph 10, when dealing with imperfect matches, may assign varying weights to different search expression terminals. For example, because “beagle” occurs more rarely than “dog,” a literal match of “beagle” may be considered more significant than a literal match of “dog.” Thus a literal match of “beagle” might be assigned a weight of 0.89, while a literal match of “dog” might be assigned a weight of 0.27. Moreover, according to the method of Paragraph 10 above, “dog” might be assigned a score of 0.08 as an approximate match for “beagle.” Numeric scores in the preceding two sentences are illustrative only.
- Various embodiments may maintain annotated word and phrase lists with scores for approximate matches, weights for literal matches, and formulas for deriving other weights and scores.
- As an example of a formula for deriving weights, suppose that data is available that numerically indicates the relative rarity of words. Then words may be assigned weights corresponding to a constant times relative rarity, so that the rarest words are assigned a weight of 1.0, and so that the most common words are assigned weights much closer to 0.0 than to 0.1.
- As an example of a formula for deriving scores for approximate matches, suppose that data is available that indicates the relative rarity of words, and data is also available indicating which words participate in entailment relations with which other words.
- Then the score for “dog” as a match for “beagle” can be a constant times the ratio of the relative rarity of “beagle” to the relative rarity of “dog.”
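- The two formulas above might be sketched as follows. The rarity values, the entailment pairs, the constant, and the function names are all hypothetical. The direction of the ratio (rarity of the target term over rarity of the candidate word) follows the order in which the text names the two words and is an assumption.

```python
from typing import Dict, Set, Tuple


def literal_weights(rarity: Dict[str, float]) -> Dict[str, float]:
    """Scale relative rarity by a constant so the rarest word gets weight 1.0."""
    rarest = max(rarity.values())
    return {word: value / rarest for word, value in rarity.items()}


def approximate_score(
    candidate: str,
    target: str,
    rarity: Dict[str, float],
    entailments: Set[Tuple[str, str]],
    constant: float,
) -> float:
    """Score `candidate` as an approximate match for `target`.

    Returns 0.0 unless (target, candidate) is a known entailment pair,
    e.g. ("beagle", "dog"), on the assumption that every beagle is a dog.
    """
    if (target, candidate) not in entailments:
        return 0.0
    return constant * rarity[target] / rarity[candidate]


# Hypothetical data: larger numbers mean rarer words.
RARITY = {"beagle": 8.0, "dog": 2.0, "the": 0.5}
ENTAILMENTS = {("beagle", "dog")}
```

- With these illustrative numbers, `literal_weights(RARITY)` gives “beagle” a weight of 1.0 and “dog” a weight of 0.25, and `approximate_score("dog", "beagle", RARITY, ENTAILMENTS, 0.02)` is approximately 0.08.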
- search expression may be
- FIG. 2 illustrates an example non-associative juxtaposition operator reflecting contrasting content configurations, in accordance with various embodiments.
- the content in FIG. 2 comprises a heading and six paragraphs. “ ——————— ” indicates any word other than “cat,” “food,” or “kitchen.” The content in FIG. 2 may be considered a better match for
- Embodiments of the present disclosure may include a search expression generator (e.g., generator 102 of FIG. 1 ) configured to begin the transformation of a text to a search expression by deleting specified words and phrases from the text. For example, very common words and phrases (such as “the” for English texts) may be deleted. For another example, words and phrases that aren't specific to subject matter (such as “for example” for English texts) may be deleted. Words and phrases to be deleted may be stored in an online dictionary.
- Various embodiments may then proceed by dividing a given text into subtexts, where subtext boundaries correspond to punctuation marks separating phrases, such as commas; punctuation marks separating sentences, such as periods; paired punctuation marks, such as quotation marks and parentheses; starts and ends of text runs in distinctive fonts; phrases separated by connectives such as “and”; paragraph starts and ends; chapter starts and ends; and so on.
- Different boundary markers correspond to different hierarchical levels of containing subtext. For example, chapters contain paragraphs, and paragraphs contain sentences.
- the transformation from text to search expression may further include insertion of instances of a juxtaposition operator between sequence elements, and between sibling parenthesized expressions within the parenthesization hierarchy.
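- The steps above (deleting listed words, dividing the text into nested subtexts, and inserting the juxtaposition operator between siblings) can be sketched as follows. This is a simplified illustration, not the generator of FIG. 1: the stop-word set, the function names, and the restriction to two boundary levels (sentence punctuation and commas or semicolons) are assumptions, and multi-word phrases are not deleted.

```python
import re

# Hypothetical deletion dictionary; a real one would be far larger.
STOP_WORDS = {"the", "a", "an", "who"}


def build_tree(text):
    """Nest words inside phrases inside sentences, dropping stop words."""
    tree = []
    for sentence in re.split(r"[.!?]+", text):
        phrases = []
        for phrase in re.split(r"[,;]", sentence):
            words = re.findall(r"[a-z0-9']+", phrase.lower())
            phrases.append([w for w in words if w not in STOP_WORDS])
        tree.append(phrases)
    return tree


def simplify(node):
    """Drop empty sub-lists and unwrap lists that hold a single child."""
    if isinstance(node, str):
        return node
    children = [simplify(child) for child in node]
    children = [child for child in children if child != []]
    return children[0] if len(children) == 1 else children


def render(node):
    """Join siblings with the operator 'o', parenthesizing nested lists."""
    if isinstance(node, str):
        return node
    return " o ".join(
        child if isinstance(child, str) else f"({render(child)})" for child in node
    )


def to_search_expression(text):
    return render(simplify(build_tree(text)))
```

- For the text “The dog, who seldom barks, saw the cat. The cat ran.”, this sketch yields `(dog o (seldom o barks) o (saw o cat)) o (cat o ran)`, so sentence boundaries sit one level above comma boundaries in the parenthesization hierarchy.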
- all or some of the transformations of Paragraph 8 may take place before the division of Paragraph 15. This permits recognition of phrases and patterns that cross indicated boundaries. For example, “the dog, who seldom barks, saw the cat” includes an instance of the subject-verb-object pattern where the subject is separated from the verb and object by an instance of the subject-verb pattern. In cases like this where two patterns cross, various embodiments place the pattern that begins first ahead of the other pattern in sequence.
- embodiments of the present disclosure apply to non-commercial posts to social network sites, to blog posts, to bulletin board posts, and to other collections of short electronic texts. Embodiments of the present disclosure also apply to collections of longer electronic texts, such as case law databases, and to electronic documents more generally. And embodiments of the present disclosure also apply to more highly configured content, such as Web pages.
- the hierarchy of parenthesized sub-expressions in generated search expressions reflects a containment hierarchy of content items.
- the set of words and phrases to be deleted becomes larger for content with greater total amounts of text.
- Some of these embodiments may use a dictionary of words and phrases to be retained (keywords), rather than a dictionary of words and phrases to be deleted.
- Logical connectives may not be preserved when generating search expressions from texts, whether working from literal texts or from logical analyses of the texts' meanings.
- a search expression such as
- The term “content collection” may refer to a database with textual content items, to two or more associated databases with textual content items, to one or more associated views derived from such databases, to a repository of documents associated with one or more databases or database views, or to a repository of documents not associated with any database or database view.
- A single document may count as an instance of a “content collection,” and similarly for a single database object/record.
- A “document” is a species of “content item.”
- Embodiments of the present disclosure may include a content item configurator (e.g., configurator 108 of FIG. 1 ), configured to compute configuration-related data for content collections.
- embodiments may use prior art methods of information retrieval to organize document collections into directed acyclic graphs of sub-collections, and use prior art methods of data mining to organize content collections into directed acyclic graphs of database objects/records.
- Content collections assigned to parent nodes contain as subsets the collections assigned to their child nodes.
- various embodiments may use various prior art methods to assign attributes to directed acyclic graph nodes.
- Embodiments of the search engine may use the methods of U.S. Pat. No. 7,987,169 to establish “relevance geometries” on directed acyclic graphs of content collections, establishing “relevance sizes” for directed acyclic graph nodes, and establishing “relevance distances” between pairs of nodes.
- Posts may be configured as a directed acyclic graph comprising two trees. In both trees, terminal nodes correspond to individual posts, and each parent of terminal nodes corresponds to the set of posts promoting a particular product.
- the first tree categorizes and sub-categorizes products by function and construction.
- The “clothing” node dominates the “headgear” node, which dominates the “billed cap” node, and so on.
- Many products may be themed with cartoon characters. Every product may have a target demographic.
- the second tree categorizes and sub-categorizes products by theme and target demographic.
- The “three-year-old to eighteen-year-old” node dominates the “three-year-old to eleven-year-old” node, which dominates the “cartoon animal character” node, which dominates the “Celeste Bluebird” node, and so on.
- the “Celeste Bluebird billed cap” node is the child of the “billed cap” node in the first tree, and is the child of the “Celeste Bluebird” node in the second tree.
- the above trees are illustrative, and not to be construed as limiting on the present disclosure; many variants of this organization into a directed acyclic graph are possible.
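- The two-tree organization above can be sketched as a small directed acyclic graph in which one product node has a parent in each tree. The class name `CollectionDag`, its methods, and the post identifier are hypothetical, and posts are attached directly to the product node rather than modeled as separate terminal nodes, which is a simplification.

```python
class CollectionDag:
    """Directed acyclic graph whose nodes stand for sets of posts."""

    def __init__(self):
        self.children = {}   # node -> set of child nodes
        self.parents = {}    # node -> set of parent nodes
        self.own_posts = {}  # node -> posts attached directly to the node

    def _reaches(self, start, target):
        """Return True if `target` is `start` or one of its descendants."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.children.get(node, ()))
        return False

    def add_edge(self, parent, child):
        if self._reaches(child, parent):
            raise ValueError("edge would create a cycle")
        self.children.setdefault(parent, set()).add(child)
        self.parents.setdefault(child, set()).add(parent)

    def add_post(self, node, post_id):
        self.own_posts.setdefault(node, set()).add(post_id)

    def collection(self, node):
        """All posts at or below `node`, so a parent is a superset of each child."""
        posts = set(self.own_posts.get(node, ()))
        for child in self.children.get(node, ()):
            posts |= self.collection(child)
        return posts


def build_example():
    dag = CollectionDag()
    by_function = ["clothing", "headgear", "billed cap",
                   "Celeste Bluebird billed cap"]
    by_theme = ["three-year-old to eighteen-year-old",
                "three-year-old to eleven-year-old",
                "cartoon animal character", "Celeste Bluebird",
                "Celeste Bluebird billed cap"]
    for path in (by_function, by_theme):
        for parent, child in zip(path, path[1:]):
            dag.add_edge(parent, child)
    dag.add_post("Celeste Bluebird billed cap", "post-1")  # hypothetical post id
    return dag
```

- In `build_example()`, the “Celeste Bluebird billed cap” node ends up with two parents, one per tree, and the same post is reachable from the root of either tree.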
- the search engine may determine relevance distance by criteria similar to the criteria that determine the organization of content collections into a tree.
- different determinations of relevance distance among sibling nodes may apply according to different trees within a directed acyclic graph.
- the relevance distance between two “billed caps” under the “Celeste Bluebird billed caps” node might be a function of the relative lengths of their bills.
- the relevance distance between two “Celeste Bluebird billed caps” might be a function of how the caps' depictions of Celeste Bluebird are posed (full body side view, head front view, . . . ).
- relevance distance for one tree may be determined by criteria similar to the criteria that determine the organization of another tree. In other alternative embodiments, relevance distance for a tree may be determined by criteria different from any of the criteria that determine the organization of trees in the directed acyclic graph.
- the relevance size of a set of posts might be a function of the number of posts in the set, the average monthly sales in dollars of the products featured in the posts in the set, and/or the average number of positive reader reactions to posts in the set.
- Suppose a candidate post is to be assessed for answers to the following three questions: What is the expected reception of this post? When is the best time to issue this post (e.g., best day of the week, best time of the day)? Which parts of the post, if any, should be rewritten to improve the expected reception of the post?
- The search expression corresponding to the candidate post may be applied to a directed acyclic graph as sketched above, e.g., by a search engine, according to the search methods of U.S. Pat. No. 7,987,169, or according to other search methods.
- The highest scoring sets of prior posts, possibly restricted according to a minimum threshold score, may be weighted according to search score and assessed according to available data on reader reactions.
- Sets of posts may also be weighted according to cardinality, and according to hierarchical position in the various trees of the directed acyclic graph. For example, posts lower in hierarchies may be considered to provide more accurate prediction.
- Predictions based on search scores, and thus based on the content of the candidate post, may be balanced against predictions determined by other means.
- The search expression corresponding to the candidate post may be applied as in Paragraph 31, with results weighted according to search scores. Such weighted results may then be assessed by, e.g., a search result analyzer (such as analyzer 112), according to available data on how timing (e.g., day of the week, time of the day) correlates with reader reactions. Recommended timings derived from these assessments may be balanced against recommendations determined by other means.
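- One way to read the score-weighted timing assessment above is as a weighted average of a reader-reaction measure per timing label. The sketch below assumes that reading; the function name `recommend_timing`, the tuple layout, and the use of a single numeric reaction measure are hypothetical.

```python
from collections import defaultdict


def recommend_timing(results):
    """Return the timing label with the best score-weighted average reaction.

    results: hypothetical (timing, search_score, reaction) triples, one per
             set of prior posts. Returns None if nothing has a positive score.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for timing, search_score, reaction in results:
        if search_score > 0:
            weighted_sum[timing] += search_score * reaction
            weight_total[timing] += search_score
    averages = {t: weighted_sum[t] / weight_total[t] for t in weight_total}
    if not averages:
        return None
    return max(averages, key=averages.get)
```

- A recommendation produced this way would still need to be balanced against recommendations determined by other means, as the text notes.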
- prior posts may be organized into a tree or directed acyclic graph based on timings. For example, prior posts may be organized into a tree based on time of day, and into a second tree based on day of week.
- the search expression based on the candidate post may be applied to such timing-based tree or trees, with the best scoring timings deemed to be recommended timings. These timing recommendations may be balanced against recommendations determined by other means.
- A sub-expression of the search expression may be applied, e.g., by a search engine, to the directed acyclic graph according to the search methods of U.S. Pat. No. 7,987,169 or according to other search algorithms. If a particular set of posts has a high search score for a particular sub-expression, and if that set of posts has negative characteristics, such as a high average incidence of negative reactions from readers, that may serve as an indication that the part of the post corresponding to the sub-expression may be problematic.
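- The test described above (a high search score combined with negative characteristics) might be sketched as a simple filter. The thresholds, the function name `flag_problem_parts`, and the use of an average negative-reaction rate as the negative characteristic are assumptions.

```python
def flag_problem_parts(results, min_score, max_negative_rate):
    """Return sub-expressions that score well against poorly received posts.

    results: hypothetical mapping from a sub-expression (as a string) to a
             list of (search_score, negative_rate) pairs, one per set of
             prior posts.
    """
    flagged = []
    for sub_expression, post_sets in results.items():
        # A sub-expression is flagged if any well-matching set of posts
        # has a negative-reaction rate above the allowed maximum.
        if any(score >= min_score and rate > max_negative_rate
               for score, rate in post_sets):
            flagged.append(sub_expression)
    return flagged
```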
- Various embodiments may configure content collections after a particular search expression is formulated, and before the search expression is applied. Some of these embodiments may form a sub-collection of a content collection according to the search expression and possibly other considerations, and then configure this sub-collection prior to applying the search expression to just that sub-collection.
- Alternative embodiments may configure content collections independently of any particular search expressions. Some of these embodiments may store the results of configuration so that these results are available when search expressions are applied. For example, configuration results for a database may be stored in the database itself, with the database scheme modified to accommodate configuration-related data. Various embodiments may use prior art methods, such as cluster analysis, to generate configuration-related data. Various embodiments may use prior art methods to dynamically maintain configurations as content collections are updated. Various embodiments may complement search-expression-independent configuration-related data with configuration-related data that is calculated after a search expression is generated.
- a candidate text for publication may contain one or more words that are not found in a given content collection. For example, a proper name within a candidate text might not occur within the content collection. In embodiments, the candidate text might nevertheless be considered to match the content collection.
- A first content collection might be an apparently closer match for a candidate text than a second content collection, even though all the words of the candidate text occur in the second content collection but not in the first content collection.
- embodiments of the search engine may assign positive search scores to content collections that do not match all the search expression terminals within a search expression, either literally or approximately. For a given search expression E, these embodiments may calculate partial-match scores for E as distinguished from full-match scores for E, where the partial-match score for E is a function of the full-match scores of the sub-expressions of E. Some embodiments may assign positive partial-match scores even in cases where there's a positive full-match score.
- the partial-match score of N for E is a function of the full-match scores of the sub-expressions of E, with weight assigned to the full-match score of a sub-expression E′ of E according to the hierarchical level of E′ in a parse tree of E.
- Parse trees for search expressions may be derived from the simple grammar of E → C, E → E o (C), E → (C) o E, E → e, where e is any atomic search expression, C → E o E, C → E o E o E, C → E o E o E o E, . .
- C corresponds to a search expression that directly includes at least one occurrence of the juxtaposition operator o.
- parsing may rewrite occurrences of C as E.
- The relevance scores of the sub-expressions may be summed, and the sum may then be divided by the number of sub-expressions at that hierarchical level, yielding $\rho_k$, the average relevance score of the sub-expressions at level k, where the root node of E is at level 0, the children of the root node are at level 1, and so on.
- The partial-match score of E for N may be calculated as a weighted average of the $\rho_k$.
- Weighting the average of the $\rho_k$ according to level assigns relative significance of partial matches based on how much of E is matched. For example, some of these embodiments may calculate partial match score as $\sum_{1 \le k \le L} \rho_k (k+c)^e$, where L+1 is the number of hierarchical levels in the parse tree of normalized E, where c>0, and where $e \le 0$.
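- The level-weighted calculation above can be sketched as follows, with a search expression represented as nested tuples of string terminals. The function names, the callable `full_match_score`, and the default values of c and e are hypothetical; only the shape of the sum (per-level averages multiplied by (k+c) raised to the power e, for levels 1 through L) is taken from the text.

```python
def level_averages(expression, full_match_score):
    """Average full-match score of the sub-expressions at each level.

    expression:       nested tuples of terminals (strings); the root is level 0.
    full_match_score: hypothetical callable returning a sub-expression's
                      full-match score for some fixed content collection.
    """
    by_level = {}

    def visit(node, level):
        by_level.setdefault(level, []).append(full_match_score(node))
        if not isinstance(node, str):
            for child in node:
                visit(child, level + 1)

    visit(expression, 0)
    return [sum(by_level[k]) / len(by_level[k]) for k in sorted(by_level)]


def partial_match_score(expression, full_match_score, c=1.0, e=-1.0):
    """Sum the level averages times (k + c) ** e over levels k = 1..L."""
    averages = level_averages(expression, full_match_score)
    # Level 0 (the root, i.e. the full match) is excluded from this sum.
    return sum(averages[k] * (k + c) ** e for k in range(1, len(averages)))
```

- For the expression `(("cat", "food"), "kitchen")` with illustrative full-match scores of 0.6 for `("cat", "food")`, 0.0 for “kitchen”, 1.0 for “cat”, and 0.5 for “food”, the level averages are 0.3 at level 1 and 0.75 at level 2, and the sketch returns 0.3/2 + 0.75/3, or about 0.4. A negative exponent gives shallower, larger sub-expressions more influence than deeper ones.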
- the partial-match score of N for E is a function of the full-match scores of the sub-expressions of E, where the full-match scores of sub-expressions are not weighted according to hierarchical levels.
- a content collection may receive a partial-match score of 0 for E. More generally, according to some embodiments, if a content collection has a positive full-match score for a sub-expression E′ of E, then full-match scores of sub-expressions of E′ may be excluded in the calculation of the partial-match score of E. According to alternative embodiments, a content collection may receive both a positive full-match score and a positive partial-match score. These alternative embodiments may distinguish cases where N matches E poorly but matches some sub-expressions of E well, from cases where N matches E poorly and matches all sub-expressions of E poorly.
- the partial-match score and the full-match score may be combined according to the methods of Paragraph 145 of U.S. Patent Publication 2009-0254549, where both partial-match score and full-match score are “beneficial” properties.
- the root node of E may be treated similarly to other nodes in the parse tree of E when calculating partial match scores.
- The formula of Paragraph 39 may be modified to $\sum_{0 \le k \le L} \rho_k (k+c)^e$, so that the sum includes the full match score.
- Embodiments of the search engine may assign search scores to content collections within directed acyclic graphs of content collections, where the parent-child relationship in directed acyclic graphs corresponds to the superset-subset relationship.
- Proximity between content collections is one realization of juxtaposition, and is an indicator of semantic relatedness.
- Various embodiments may use methods that establish measures of proximity between content collections. Given proximity measurements between content collections, various embodiments may use search methods that are sensitive to these proximity measurements.
- Various embodiments may use search methods for which the proximity operator is non-associative. Among these embodiments, various embodiments may incorporate the search methods and apparatuses of U.S. Pat. No. 7,987,169, and U.S. Patent Publication 2009-0254549.
- Embodiments of the analyzer may apply prior art methods of predictive analytics, and of statistical analysis more generally, to content collections that receive positive search scores.
- Various embodiments may include search scores among the parameters that determine the level of confidence associated with analysis. Higher search scores may indicate closer correspondence to the search expression, and thus may indicate greater similarity to the candidate text.
- Various embodiments may apply analytics only to content collections whose search scores are greater than a fixed threshold score.
- Various other embodiments may apply analytics only to content collections whose search scores are greater than a constant times the maximum search score obtained for any content collection.
- Various other embodiments may use multiple parameters to determine which content collections receive application of which analytics.
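- The two single-parameter selection rules above (a fixed threshold, and a constant times the maximum search score) can be sketched as follows. The function names and the dictionary layout mapping collection identifiers to search scores are assumptions.

```python
def select_by_fixed_threshold(scores, threshold):
    """Keep collections whose search score exceeds a fixed threshold."""
    return {name: s for name, s in scores.items() if s > threshold}


def select_by_relative_threshold(scores, fraction):
    """Keep collections scoring above `fraction` times the maximum score."""
    if not scores:
        return {}
    cutoff = fraction * max(scores.values())
    return {name: s for name, s in scores.items() if s > cutoff}
```

- The relative rule adapts to how well the best collection matches, whereas the fixed rule may select nothing at all when every score is low.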
- FIG. 1 is a block diagram illustrating an overview of an arrangement configured to practice the search and analysis methods, in accordance with various embodiments.
- The text included within a content item 101 may be directed to search expression generator 102, incorporated with the teachings of the present disclosure, which may, in response, produce a search expression 103, as earlier described.
- The text included within content item 101 may also be directed to primary query generator 104, which may, in response, produce a primary query 105, also as earlier described.
- Primary query 105 may be generated according to the content retrieval functionality associated with content item database or repository 106 .
- primary query 105 is an SQL query.
- Primary query 105 may be directed to content item database or repository 106, which may, in response, produce a set of prior content items with associated data 107, where the associated data may include configuration-related data stored in content item database or repository 106.
- Set of prior content items with associated data 107 may be directed to prior content item configurator 108 , incorporated with the teachings of the present disclosure.
- Prior content item configurator 108, or a module incorporating similar methods, may also be applied independently of any search expressions, with results stored in content item database or repository 106.
- Prior content item configurator 108 may, in response, produce a directed acyclic graph (which may be a simple tree) 109 of prior content items, possibly with relevance geometries as described in U.S. Pat. No. 7,987,169, and possibly with associated data.
- Search expression 103 and directed acyclic graph 109 of prior content items, with relevance geometries and with associated data, may then be directed to search engine 110.
- Search engine 110 may, in response, produce search results 111, comprising scores assigned to nodes of directed acyclic graph 109. These scores may indicate which sets of prior content items correspond most closely to search expression 103, and thus to candidate content item 101.
- Search results analyzer 112 may analyze these sets of prior content items, with both scores assigned to directed acyclic graph nodes and data associated with content items as inputs, and may, in response, produce, among other possible outputs, predicted effects 113 of publishing candidate content item 101, recommended circumstances 114 for publishing candidate content item 101, and recommended parts 115 of candidate content item 101 to rewrite.
- FIG. 2 illustrates an example non-associative juxtaposition operator reflecting contrasting content configurations.
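- The content in FIG. 2 comprises a heading and six paragraphs. “———————” indicates any word other than “cat,” “food,” or “kitchen.” The content in FIG. 2 is a better match for cat o (food o kitchen) than for (cat o food) o kitchen, where o is a non-associative juxtaposition operator.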
- an entertainment description such as a description of a film or a song
- an advertisement may be in the form of a database record that includes a text field that characterizes the product advertised.
- methods of previous paragraphs may apply to constituents of content items. For example, given a news article, methods of the previous paragraphs may generate a search expression from a paragraph of the news article.
- methods of previous paragraphs may apply to collections of content items and/or content item constituents.
- Where a content collection is organized hierarchically, with content collections as parents of content items, content item constituents, and/or other content collections, the search expression corresponding to a parent content collection may be the juxtaposition of the search expressions corresponding to the children of the parent.
- the search expression corresponding to a parent cluster may be the juxtaposition of the search expressions corresponding to the children of the parent cluster.
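The hierarchical rule for parent collections and parent clusters can be sketched as follows. This is only an illustration under stated assumptions: the `Node` class, the functions `expression_for` and `render`, and the nested-tuple representation of juxtaposition are hypothetical names chosen here, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Assumed representation: a search expression is a single term or a tuple of
# sub-expressions joined by the juxtaposition operator "o".
Expr = Union[str, Tuple["Expr", ...]]


@dataclass
class Node:
    """A content collection (has children) or a leaf content item (has terms)."""
    terms: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def expression_for(node: Node) -> Expr:
    """Leaf: juxtapose its terms. Parent: juxtapose its children's expressions."""
    if not node.children:
        return tuple(node.terms)
    return tuple(expression_for(child) for child in node.children)


def render(expr: Expr) -> str:
    """Render nested tuples with explicit 'o' operators and parentheses."""
    if isinstance(expr, str):
        return expr
    if len(expr) == 1:
        return render(expr[0])  # a one-element group needs no parentheses
    return "(" + " o ".join(render(e) for e in expr) + ")"


cluster = Node(children=[Node(terms=["cat", "food"]), Node(terms=["kitchen"])])
print(render(expression_for(cluster)))  # ((cat o food) o kitchen)
```

Keeping the children's expressions as separate nested groups, instead of flattening them, preserves the hierarchy in the generated expression.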
- Content items and constituents of content items that do not comprise texts may have associated texts.
- an HTML product description may include an image with a text assigned to the “alt” attribute.
- an advertisement description in a database may include as constituents various text fields.
- Various embodiments treat texts associated with constituents similarly to the way they treat text constituents.
- For example, suppose a simple product description comprises a title “Acme Garden Rake,” a text description “This steel rake is sturdy and light,” and an uncaptioned image with “gathering autumn leaves” assigned to its “alt” attribute.
- One possible search expression generated from the product description may be (acme o garden o rake) o (steel o rake o sturdy o light) o (gathering o autumn o leaves).
- Various embodiments may assign weights to sub-expressions of the generated search expression according to the roles of the sub-expressions' corresponding source content.
- the sub-expression corresponding to the title (acme o garden o rake) may be assigned a greater weight than the sub-expression corresponding to the text description (steel o rake o sturdy o light), reflecting the presumed greater significance of the title.
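The garden rake example can be reproduced with a short sketch. The stop-word list, the numeric role weights, and the function names `sub_expression` and `product_expression` are assumptions made for illustration; the text only states that a title may be weighted more heavily than a description.

```python
import re

# Hypothetical stop-word list, just large enough for the example.
STOP_WORDS = {"this", "is", "and", "the", "a", "an", "of"}

# Hypothetical role weights; only "title > description" comes from the text.
ROLE_WEIGHTS = {"title": 2.0, "description": 1.0, "alt": 1.0}


def sub_expression(text):
    """Lower-case the text, drop stop words, and juxtapose what remains."""
    words = [w for w in re.findall(r"[a-z0-9'-]+", text.lower())
             if w not in STOP_WORDS]
    return "(" + " o ".join(words) + ")"


def product_expression(constituents):
    """constituents: list of (role, text) pairs in document order.

    Returns the full expression and a list of (sub_expression, weight) pairs.
    """
    parts = [(sub_expression(text), ROLE_WEIGHTS.get(role, 1.0))
             for role, text in constituents]
    expression = " o ".join(sub for sub, _ in parts)
    return expression, parts


expression, parts = product_expression([
    ("title", "Acme Garden Rake"),
    ("description", "This steel rake is sturdy and light"),
    ("alt", "gathering autumn leaves"),
])
print(expression)
```

The text associated with the image's “alt” attribute is handled exactly like a text constituent, which mirrors the treatment described for associated texts.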
- Search expressions where sub-expressions are assigned varying weights can be evaluated so that the weighting of sub-expressions is reflected in search scores, as disclosed in U.S. application Ser. No. 14/105,034, entitled METHODS AND APPARATUSES FOR CONTENT PREPARATION AND/OR SELECTION, filed on Dec. 12, 2013, having common inventorship with the present application.
- “terms” are words, designated phrases, or designated classes of words and phrases, where members of the same class share the same root or share one or more other designated properties.
- Various embodiments filter terms from search expressions according to criteria based on term frequency within content from which the search expression may be generated and/or based on counts of documents that contain terms. For example, various embodiments use various term-frequency inverse-document-frequency weighting schemes of prior art to determine which words and phrases to filter.
- Various embodiments may calculate term frequency for an instance of content from multiple content instances, and then apply formulas for aggregate term frequency that distinguish among the source content instances, where content instances may be content items, constituents of content items, or content collections.
- Various embodiments may calculate inverse document frequency for an instance of content from multiple content instances, and then apply formulas for aggregate inverse document frequency that distinguish among the source content instances. For example, consider a directed acyclic graph of clusters of news articles that has been established according to prior art clustering methods, and consider a news article within this directed acyclic graph that is contained within two parent clusters, each of which has a single parent cluster and a single grandparent cluster, where the two grandparent clusters are children of a common great-grandparent cluster.
- the tf-idf (term-frequency-inverse-document-frequency) score for a term and for this news article may be $(1 + \log(t_1 + 0.5\,t_2)) \cdot \log\bigl(1 + 0.5\,N_3/d_3 + (N_4 - N_3)/(d_4 - d_3)\bigr)$, where $t_1$ is the number of occurrences of the term in the news article, $t_2$ is the number of occurrences of the term in other news articles in the parent clusters, $N_3$ is the number of documents in the two grandparent clusters together, $N_4$ is the number of documents in the great-grandparent cluster, $d_3$ is the number of documents within the two grandparent clusters together that contain the term, and $d_4$ is the number of documents within the great-grandparent cluster that contain the term.
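A minimal sketch of this cluster-aware score, reading the formula as (1 + log(t1 + 0.5·t2)) · log(1 + 0.5·N3/d3 + (N4 − N3)/(d4 − d3)). The function name `cluster_tfidf`, the choice of natural logarithm, and the guard conditions are assumptions; the text does not specify a log base or how degenerate counts should be handled.

```python
import math


def cluster_tfidf(t1, t2, n3, d3, n4, d4):
    """Cluster-aware tf-idf for one term and one news article.

    t1: occurrences of the term in the article
    t2: occurrences in other articles of the parent clusters
    n3, d3: documents / term-bearing documents in the two grandparent clusters
    n4, d4: documents / term-bearing documents in the great-grandparent cluster
    """
    if t1 + 0.5 * t2 <= 0:
        return 0.0  # assumed convention: an absent term scores zero
    if d3 <= 0 or d4 <= d3:
        raise ValueError("need d3 > 0 and d4 > d3 to avoid division by zero")
    tf = 1 + math.log(t1 + 0.5 * t2)
    idf = math.log(1 + 0.5 * n3 / d3 + (n4 - n3) / (d4 - d3))
    return tf * idf


score = cluster_tfidf(t1=3, t2=4, n3=100, d3=10, n4=1000, d4=40)
print(round(score, 3))
```

Occurrences in sibling articles count at half weight in the term-frequency factor, and the nearer grandparent clusters are likewise discounted by 0.5 relative to the remainder of the great-grandparent cluster in the document-frequency factor.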
- Various embodiments may assign weights to words and phrases of generated search expressions according to criteria based on term frequency within content from which the search expression is generated and/or based on document frequency within other content.
- a search expression generated from a content collection C may of course be applied to content collections that have varying amounts of content shared with C.
- a search expression generated from C may be applied to a content collection that has nothing in common with C except the appearances of certain words.
- various embodiments of the present disclosure may assign weights to sub-expressions according to user inputs that distinguish among constituents of the content items. For example, suppose that a search expression is to be generated from a news article that a user is reading in a Web browser, and that the generated search expression is to be used to search for current news articles to be recommended to the user.
- the user can highlight one or more areas of the browser presentation of the news article, employing a user interface feature incorporated in the Web page that contains the news article, or a user interface feature incorporated in the browser.
- the user interface feature may allow different levels of highlighting. Sub-expressions of the generated search expression that correspond to constituents of the news article that appear in highlighted areas may be assigned greater weights than constituents that do not appear in highlighted areas. Different levels of highlighting may correspond to differences among assigned weights.
- various embodiments may assign weights to sub-expressions according to user inputs that distinguish among sub-collections of collections, elements of collections, and constituents of elements of collections. For example, according to various embodiments, during a session on a shopping Web site, the user can indicate products of interest by checking provided check boxes on product pages, or by indicating levels of interest with sliders provided on product pages, or similar. Sub-expressions of the generated search expression may be weighted so that sub-expressions that correspond to checked products may be assigned greater weights than sub-expressions that do not correspond to checked products. For user interfaces that accommodate indication of degree of interest, as with sliders, different levels of indicated interest may correspond to differences among assigned weights.
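One simple way to turn highlighting levels, check boxes, or slider positions into sub-expression weights is a linear mapping. The function name `weight_sub_expressions` and the `base` and `step` parameters are hypothetical; the text only requires that higher levels of indicated interest may correspond to greater weights.

```python
def weight_sub_expressions(sub_expressions, interest_levels, base=1.0, step=0.5):
    """Pair each sub-expression with a weight derived from user input.

    interest_levels maps a sub-expression's position to an integer level
    (0 = not highlighted or unchecked, 1 = checked, 2+ = stronger interest).
    """
    weighted = []
    for index, sub in enumerate(sub_expressions):
        level = interest_levels.get(index, 0)
        weighted.append((sub, base + step * level))
    return weighted


# The second paragraph was highlighted at level 2; the others were not.
print(weight_sub_expressions(["(storm o coast)", "(rescue o crews)", "(forecast)"], {1: 2}))
```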
- various embodiments may move and/or copy the sub-expression within the search expression. Some of these embodiments may move and/or copy sub-expressions instead of assigning weights to sub-expressions. Other embodiments may move and/or copy sub-expressions in addition to assigning weights to sub-expressions.
- As an example of copying a sub-expression, consider a news article where a user has highlighted one paragraph. Suppose that E is the full search expression for the news article that would be generated if no constituent of the news article is distinguished based on user inputs or user-related data. Suppose that E′ is the sub-expression of E that corresponds to the highlighted paragraph. Then E o E′ may be an example of a search expression that results from copying E′, reflecting the significance of the highlighted paragraph.
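The copy operation can be sketched with nested tuples standing in for juxtaposition. The helper names `juxtapose` and `copy_sub_expression` and the sample terms are illustrative assumptions.

```python
def juxtapose(*parts):
    """Represent 'a o b o ...' as a tuple, preserving the grouping."""
    return tuple(parts)


def copy_sub_expression(expr, index):
    """Return E o E', where E' is the index-th top-level sub-expression of E."""
    return juxtapose(expr, expr[index])


# E has one sub-expression per paragraph; the user highlighted the second one.
E = juxtapose(juxtapose("storm", "coast"), juxtapose("rescue", "crews"))
print(copy_sub_expression(E, 1))
```

The original expression E is left intact inside the result, so the highlighted paragraph contributes twice: once in its original position and once as the appended copy.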
- Various embodiments of the present disclosure may use the results of searches with generated search expressions to generate recommendations.
- the objects of generated recommendations may include but are not limited to documents such as news articles, Wikipedia articles, and Web pages, sub-documents such as paragraphs within news articles, document collections such as the works in a bibliography, product recommendations such as one or more brief product presentations on a retailer Web page that presents another product in detail, entertainment recommendations such as descriptive links that lead directly or indirectly to sound or video files, and advertisements.
- Intended audiences for generated recommendations may include but are not limited to single users, sets of users, and automated systems such as advertising servers.
- the forms in which generated recommendations are provided may include but are not limited to Web links, excerpts from recommended content, descriptions of recommended content, and instructions to automated systems to present content.
- Embodiments of the present disclosure may generate recommendations in conjunction with various prior art technologies.
- a search expression with nested juxtaposition sub-expressions may be generated according to methods of previous paragraphs and applied to a collection of news articles that have been published during a specified time window.
- the news articles that score highest in the search may then be presented to the user in the form of a list of descriptive links that appear on the Web page that includes the news article that the user is reading.
- the generated search expression may be applied to a collection of advertisements paired with accompanying descriptions, and the advertisement that scores highest may then be presented to the user on the Web page that includes the news article that the user is reading.
- recommendation generation may iteratively invoke a search expression generation step followed by a search step.
- the search expression generated from three news articles may be applied to a collection of news articles associated with a time window extending back for months or years.
- the collection corresponding to the top results of this first generate-and-search iteration may then be used to generate a search expression which is then applied to a collection of news articles associated with a time window extending back for minutes or hours.
- the news articles that score highest in this second search may then be presented to the user in the form of a list of descriptive links that appear on the Web page that includes the news article that the user is reading.
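The two-pass flow can be sketched as below. Both `generate_expression` and `search` are deliberately crude stand-ins (most frequent words and term-overlap counting) and are not the nested juxtaposition scoring of the disclosure; all names and sample data are hypothetical.

```python
from collections import Counter


def generate_expression(docs, top_k=5):
    """Stand-in generator: the top_k most frequent words across docs."""
    counts = Counter(w for d in docs for w in d.lower().split())
    return [w for w, _ in counts.most_common(top_k)]


def search(expression, collection, top_n=2):
    """Stand-in search: rank documents by how many expression terms they contain."""
    terms = set(expression)
    ranked = sorted(collection,
                    key=lambda d: len(terms & set(d.lower().split())),
                    reverse=True)
    return ranked[:top_n]


def iterative_recommend(seed_docs, archive, recent, top_n=2):
    """Generate-and-search twice: first over the archive, then over recent items."""
    first_expression = generate_expression(seed_docs)
    archive_hits = search(first_expression, archive, top_n)
    second_expression = generate_expression(archive_hits)
    return search(second_expression, recent, top_n)


seeds = ["flood river town", "river flood rescue"]
archive = ["river flood levee history", "stock market report", "flood levee repair budget"]
recent = ["levee repair begins today", "football final tonight", "market rallies"]
print(iterative_recommend(seeds, archive, recent)[0])  # levee repair begins today
```

In this toy run the recent article shares no words with the seed articles; it is found only because the archive pass introduced the terms "levee" and "repair" into the second expression.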
- the user clusters that score highest in this first search may then be used to generate a second search expression, where the second search expression reflects product descriptions within purchasing histories within user clusters, respecting nesting within product descriptions, nesting among user clusters, and possibly nested structures within individual purchasing histories. Products whose descriptions score highest in this second search may then be presented to the user together with product recommendations based on the user's purchasing history and viewing history, including the current product in the viewing history. It may be emphasized that these examples serve only to illustrate iterative generate-and-search.
- FIG. 3 illustrates an architecture view of a computing device 300, such as a server, a desktop computer, a tablet or a PDA, suitable for practicing methods of the present disclosure, in accordance with various embodiments.
- Computing device 300 may be a server or a client. Whether as a server or client, computing device 300 may be coupled to clients or servers via a wireless or wireline-based interconnection, over one or more private and/or public networks, including the well-known public network, the Internet.
- Computing device 300 may include elements found in conventional computing devices, such as (single or multi-core) micro-controller/processor 302, digital signal processor (DSP) 304, a non-transitory storage medium such as non-volatile memory 306, display 308, input keys 310 (such as a 12-key pad, select button, D-unit), and transmit/receive (TX/RX) 312, coupled to each other via bus 314, which may be a single bus or a hierarchy of bridged buses.
- Non-volatile memory 306 includes operating logic 320 adapted to implement the earlier described components (search expression generator, search engine, analyzer, configurator, and so forth) to practice the earlier described methods or some of their operations. The implementation may be via any one of a number of programming languages, such as assembly, C, and so forth.
- All or portions of the end user interface may be implemented in hardware, firmware, or a combination thereof.
- Hardware implementations may be in the form of an application-specific integrated circuit (ASIC), reconfigured reconfigurable circuits (such as a Field Programmable Gate Array (FPGA)), and so forth.
Abstract
Description
-
- dog o ((cat o food) o kitchen)
where o is a juxtaposition operator.
-
- cat o (food o kitchen)
than for
- (cat o food) o kitchen
where o is a non-associative juxtaposition operator.
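Non-associativity means the two groupings are different expressions even though they contain the same terms in the same order. A minimal sketch, assuming a nested-pair representation (the names `juxtapose` and `leaves` are illustrative):

```python
def juxtapose(left, right):
    """Binary, non-associative juxtaposition: simply record the grouping."""
    return (left, right)


def leaves(expr):
    """List the terms of an expression from left to right."""
    if isinstance(expr, str):
        return [expr]
    left, right = expr
    return leaves(left) + leaves(right)


right_grouped = juxtapose("cat", juxtapose("food", "kitchen"))   # cat o (food o kitchen)
left_grouped = juxtapose(juxtapose("cat", "food"), "kitchen")    # (cat o food) o kitchen

print(right_grouped != left_grouped)                   # True
print(leaves(right_grouped) == leaves(left_grouped))   # True
```

Because the grouping is kept in the data structure, a scoring function can treat the two expressions differently, as in the FIG. 2 comparison.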
-
- In a test of the 2011 Mavic R-Sys SLR Wheel-Tyre System, the UK cycling magazine Cycling Plus awarded Editor's Choice to the Mavic R-Sys. “These are among the lightest and smoothest wheels we've ever tested, with hard, out-of-the-saddle sprints, big lean angles in corners, and rough roads and cobbles all handled with the minimum of fuss,” wrote the reviewers.
- Applying operations described in Paragraph 14 can yield:
- test 2011 Mavic R-Sys SLR Wheel-Tyre System, UK cycling magazine Cycling Plus awarded Editor's Choice Mavic R-Sys. “lightest smoothest wheels tested, hard, out-of-the-saddle sprints, big lean angles corners, rough roads cobbles handled minimum fuss,” wrote reviewers.
- Applying operations described in Paragraph 15 can then yield:
- ((test 2011 Mavic “R-Sys” SLR “Wheel-Tyre” System) (UK cycling magazine Cycling Plus awarded Editor's Choice Mavic “R-Sys”)) (((lightest smoothest wheels tested) hard (“out-of-the-saddle” sprints) (big lean angles corners) (rough roads cobbles handled minimum fuss)) wrote reviewers)
- Applying operations described in Paragraph 17 can then yield the search expression:
- ((test o 2011 o Mavic o “R-Sys” o SLR o “Wheel-Tyre” o System) o (UK o cycling o magazine o Cycling o Plus o awarded o Editor's o Choice o Mavic o “R-Sys”)) o (((lightest o smoothest o wheels o tested) o hard o (“out-of-the-saddle” o sprints) o (big o lean o angles o corners) o (rough o roads o cobbles o handled o minimum o fuss)) o wrote o reviewers)
where o is a juxtaposition operator.
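The last step of the worked example, turning a parenthesized grouping of terms into a search expression with explicit operators, can be sketched as a small parser. The tokenizer pattern and the function names `parse` and `render` are assumptions; quoted phrases are kept as single terms, in either straight or curly quotes.

```python
import re

# Parentheses, quoted phrases (curly or straight quotes), or bare words.
TOKEN = re.compile(r'\(|\)|“[^”]*”|"[^"]*"|[^\s()]+')


def parse(text):
    """Parse parenthesized groups into nested lists of terms."""
    stack = [[]]
    for token in TOKEN.findall(text):
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def render(group, top=True):
    """Join the members of every group with the juxtaposition operator 'o'."""
    parts = [p if isinstance(p, str) else render(p, top=False) for p in group]
    joined = " o ".join(parts)
    return joined if top else "(" + joined + ")"


grouped = '((test 2011 Mavic "R-Sys") (UK cycling magazine)) (wrote reviewers)'
print(render(parse(grouped)))
```

The shortened input above follows the shape of the wheel review example; the nesting of the output matches the nesting of the input, with `o` inserted between adjacent members of each group.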
-
- ((test 2011 Mavic “R-Sys” SLR “Wheel-Tyre” System) (UK cycling magazine Cycling Plus awarded Editor's Choice Mavic “R-Sys”)) ((((lightest) (smoothest wheels tested)) hard (“out-of-the-saddle” sprints) (big lean angles corners) (((rough roads) (cobbles handled minimum fuss)))) wrote reviewers)
- Incorporating a more sophisticated understanding of English coordination might yield instead:
- ((test 2011 Mavic “R-Sys” SLR “Wheel-Tyre” System) UK cycling magazine Cycling Plus awarded Editor's Choice Mavic “R-Sys”) ((((lightest smoothest) wheels tested) ((hard (“out-of-the-saddle” sprints)) (big lean angles corners) (rough (roads cobbles)) handled minimum fuss)) wrote reviewers)
- Additionally incorporating knowledge of noun compounds might yield:
- ((test (2011 (Mavic (“R-Sys” SLR (“Wheel-Tyre” System))))) UK (cycling magazine) (Cycling Plus) awarded (Editor's Choice) (Mavic “R-Sys”)) ((((lightest smoothest) wheels tested) ((hard (“out-of-the-saddle” sprints)) (big lean angles corners) (rough (roads cobbles)) handled minimum fuss)) wrote reviewers)
-
- “dog AND NOT cat”
is a request for content where “dog” appears and “cat” does not appear, and
- “the dog didn't bite the cat”
for example, does not correspond to this search expression. In alternate embodiments, logical operators may be preserved. For various ones of these embodiments, search expressions may be normalized by moving negative operators inward prior to calculating scores.
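Moving negative operators inward is commonly done with De Morgan's laws and double-negation elimination. The sketch below assumes that standard technique and a tuple encoding of expressions; the function name `push_not_inward` and the encoding are illustrative, and the source does not state which normalization procedure its embodiments use.

```python
def push_not_inward(expr, negate=False):
    """Rewrite an expression so that NOT applies only to single terms.

    expr is a term (str) or a tuple: ("NOT", e), ("AND", e1, e2, ...),
    or ("OR", e1, e2, ...).
    """
    if isinstance(expr, str):
        return ("NOT", expr) if negate else expr
    op, *args = expr
    if op == "NOT":
        # A negation flips the pending polarity; double negations cancel.
        return push_not_inward(args[0], not negate)
    if op in ("AND", "OR"):
        if negate:
            # De Morgan: NOT (a AND b) = (NOT a) OR (NOT b), and vice versa.
            op = "OR" if op == "AND" else "AND"
        return (op, *[push_not_inward(a, negate) for a in args])
    raise ValueError("unknown operator: " + str(op))


print(push_not_inward(("NOT", ("AND", "dog", ("NOT", "cat")))))
```

After normalization every NOT sits directly on a term, so a scorer only has to handle negated terms and never negated sub-expressions.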
VII. Generation of Search Expressions from Additional Categories of Content
Claims (21)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/930,491 US9600587B2 (en) | 2011-10-19 | 2015-11-02 | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/276,778 US9208218B2 (en) | 2011-10-19 | 2011-10-19 | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
US14/930,491 US9600587B2 (en) | 2011-10-19 | 2015-11-02 | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/276,778 Continuation-In-Part US9208218B2 (en) | 2011-10-19 | 2011-10-19 | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
Publications (2)
Publication Number | Publication Date |
---|---|
US20160070807A1 (en) | 2016-03-10 |
US9600587B2 (en) | 2017-03-21 |
Family
ID=55437708
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/930,491 Active US9600587B2 (en) | 2011-10-19 | 2015-11-02 | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
Country Status (1)
Country | Link |
---|---|
US (1) | US9600587B2 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11048765B1 (en) | 2008-06-25 | 2021-06-29 | Richard Paiz | Search engine optimizer |
US10922363B1 (en) * | 2010-04-21 | 2021-02-16 | Richard Paiz | Codex search patterns |
US11809506B1 (en) | 2013-02-26 | 2023-11-07 | Richard Paiz | Multivariant analyzing replicating intelligent ambience evolving system |
US11741090B1 (en) | 2013-02-26 | 2023-08-29 | Richard Paiz | Site rank codex search patterns |
US10274440B2 (en) * | 2016-06-22 | 2019-04-30 | International Business Machines Corporation | Method to facilitate investigation of chemical constituents in chemical analysis data |
CN107229700A (en) * | 2017-05-24 | 2017-10-03 | 成都明途科技有限公司 | A kind of intelligent recommendation system of government affairs data and news |
CN109145218B (en) * | 2018-09-10 | 2021-11-02 | 北京一点网聚科技有限公司 | Article recommendation method and device |
US12164367B2 (en) * | 2020-12-18 | 2024-12-10 | Intel Corporation | Systems, methods, and devices for reducing systemic risks |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3568156A (en) | 1967-08-09 | 1971-03-02 | Bell Telephone Labor Inc | Text matching algorithm |
US20020126905A1 (en) | 2001-03-07 | 2002-09-12 | Kabushiki Kaisha Toshiba | Mathematical expression recognizing device, mathematical expression recognizing method, character recognizing device and character recognizing method |
US6658404B1 (en) | 1999-09-20 | 2003-12-02 | Libera, Inc. | Single graphical approach for representing and merging boolean logic and mathematical relationship operators |
US20060010105A1 (en) | 2004-07-08 | 2006-01-12 | Sarukkai Ramesh R | Database search system and method of determining a value of a keyword in a search |
US20060074903A1 (en) | 2004-09-30 | 2006-04-06 | Microsoft Corporation | System and method for ranking search results using click distance |
US20060195427A1 (en) | 2005-02-25 | 2006-08-31 | International Business Machines Corporation | System and method for improving query response time in a relational database (RDB) system by managing the number of unique table aliases defined within an RDB-specific search expression |
US20070022099A1 (en) | 2005-04-12 | 2007-01-25 | Fuji Xerox Co., Ltd. | Question answering system, data search method, and computer program |
US20070061384A1 (en) | 2003-07-30 | 2007-03-15 | Xerox Corporation | Multi-versioned documents and method for creation and use thereof |
US20070288438A1 (en) | 2006-06-12 | 2007-12-13 | Zalag Corporation | Methods and apparatuses for searching content |
US20080033938A1 (en) | 2006-08-03 | 2008-02-07 | Kabushiki Kaisha Toshiba | Keyword outputting apparatus, keyword outputting method, and keyword outputting computer program product |
US20080065685A1 (en) | 2006-08-04 | 2008-03-13 | Metacarta, Inc. | Systems and methods for presenting results of geographic text searches |
US7383504B1 (en) | 1999-08-30 | 2008-06-03 | Mitsubishi Electric Research Laboratories | Method for representing and comparing multimedia content according to rank |
US20080235187A1 (en) | 2007-03-23 | 2008-09-25 | Microsoft Corporation | Related search queries for a webpage and their applications |
US20090012946A1 (en) | 2007-07-02 | 2009-01-08 | Sony Corporation | Information processing apparatus, and method and system for searching for reputation of content |
US20090119258A1 (en) | 2007-11-05 | 2009-05-07 | William Petty | System and method for content ranking and reviewer selection |
KR100913049B1 (en) | 2008-01-29 | 2009-08-20 | 엔에이치엔(주) | Method and system for providing positive / negative search result using user preference |
US20100250547A1 (en) | 2001-08-13 | 2010-09-30 | Xerox Corporation | System for Automatically Generating Queries |
US20130103662A1 (en) | 2011-10-19 | 2013-04-25 | Zalag Corporation | Methods and apparatuses for generating search expressions from content, for applying search expressions to content collections, and/or for analyzing corresponding search results |
Non-Patent Citations (5)
Title |
---|
Final Office Action mailed Jul. 9, 2014 for U.S. Appl. No. 13/726,778, 26 pages. |
Non-Final Office Action mailed Dec. 31, 2013 for U.S. Appl. No. 13/726,778, 31 pages. |
Non-Final Office Action mailed Jul. 2, 2013 for U.S. Appl. No. 13/726,778, 23 pages. |
Non-Final Office Action mailed Mar. 4, 2015 for U.S. Appl. No. 13/726,778, 23 pages. |
PCT International Search Report and Written Opinion for PCT/US2012/058795, mailed Jan. 30, 2013, 10 pages. |
Also Published As
Publication number | Publication date |
---|---|
US20160070807A1 (en) | 2016-03-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: ZALAG CORPORATION, WASHINGTON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EPSTEIN, SAMUEL S.;REEL/FRAME:036944/0074 Effective date: 20151030 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: EPSTEIN, SAMUEL S., NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZALAG CORPORATION;REEL/FRAME:046283/0792 Effective date: 20171130 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 4 |
|
AS | Assignment |
Owner name: BABAYAN, LANA IVANOVNA, WASHINGTON Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EPSTEIN, SAMUEL S.;REEL/FRAME:062364/0077 Effective date: 20230112 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
FEPP | Fee payment procedure |
Free format text: 7.5 YR SURCHARGE - LATE PMT W/IN 6 MO, SMALL ENTITY (ORIGINAL EVENT CODE: M2555); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 8 |