US10482167B2 - Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page - Google Patents
- Publication number
- US10482167B2 (application US14/864,448)
- Authority
- US
- United States
- Prior art keywords
- field
- form information
- crowd
- web page
- web
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G06F17/241—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/169—Annotation, e.g. comment data or footnotes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G06F17/243—
-
- G06F17/2725—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/174—Form filling; Merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/226—Validation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H04L67/2852—
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
- H04L67/5682—Policies or rules for updating, deleting or replacing the stored data
Definitions
- Embodiments described herein generally relate to client and server networks and, more particularly, to determining a location of enrollment fields in and the type of form, in a credential seeking web page, by using crowd-sourced information.
- Each web-based account (referred to herein as a web account) requires a user to provide a username, a password, and/or other user credentials in, for example, a web browser to provide access to the web account.
- Each web account may present, in a web page, a web form to the user during initial login and subsequent access to the web account.
- This web form is a structured document that includes “form fields” for entering user identifier or credential information, such as a user ID (a user identifier), a password, or the like.
- Credential manager applications provide the ability to store user credentials that can later be used to log a user into the user's online accounts through web pages received over the internet. These applications log the user into the online account by entering user credentials in one or more fields in a web form that is received in the web page.
- However, web forms in web pages may change, and the enrollment fields of a new web page may not be located where they were in previous web pages. Therefore, user credentials that are stored on a user device cannot be used in the new web pages.
- A way of determining a location of enrollment fields in a credential-seeking web page would be desirable.
- FIG. 1 is a diagram illustrating a system for identifying enrollment fields using crowd-sourced data according to one embodiment.
- FIG. 2 is a flowchart illustrating a technique for identifying fields and filling fields in a web page by a credential manager application according to one embodiment.
- FIG. 3 is a flowchart illustrating a crowd-source assisted technique for identifying fields in a web page by a server according to one embodiment.
- FIG. 4 is a diagram illustrating a computing device for use with techniques described herein according to one embodiment.
- FIG. 5 is a block diagram illustrating a computing device for use with techniques described herein according to another embodiment.
- FIG. 6 is a diagram illustrating a network of programmable devices according to one embodiment.
- The term “computer system” can refer to a single computer or a plurality of computers working together to perform the function described as being performed on or by a computer system.
- The term “medium” can refer to a single physical medium or a plurality of media that together store the information described as being stored on the medium.
- The term “web crawler” can refer to an automated program, or script, that methodically scans or “crawls” through Internet pages to create an index of the data the web crawler is looking for.
- There are several uses for such a program, perhaps the most popular being search engines using it to provide web surfers with relevant websites.
- A web crawler can also be referred to as a web spider, a web robot, a bot, a crawler, or an automatic indexer.
- The term “headless browser” can refer to a web browser without a graphical user interface (GUI) that can access web pages over the internet but does not display them in the GUI on a client.
- a technique allows a credential manager application on a client computer system to identify fields and forms on a web page.
- An analysis server may automatically crawl web pages and identify the fields and form, then push the information to the client computer system for use by the credential manager. If the credential manager discovers the information is not available, the credential manager may analyze the web form to discover the fields and form information, then provide the discovered information to the analysis server for providing to other client computers.
- the analysis server may use crowd-sourcing for asynchronous verification of field and form information discovered by the analysis server or provided by the client computer.
- FIG. 1 illustrates a system 100 according to one embodiment.
- System 100 may include a web/content server 102 , analysis server 104 , network 106 , client 108 , and crowd-sourcing server 110 . While a single element of each type is illustrated in FIG. 1 for clarity, any number of each of the elements may be used as desired.
- Web server 102 is a server that communicates one or more web pages to client 108 via network 106 .
- Web server 102 may transmit one or more HyperText Markup Language (HTML) web pages with web forms (also referred to as “login forms”) having enrollment fields to a web browser 112 on client 108 in response to a Hypertext Transfer Protocol (HTTP) request, for example, a HTTP GET request from client 108 .
- Web pages that are received from web server 102 may also include HTML text and Cascading Style Sheets (CSS) data.
- the web server 102 may also transmit web pages to analysis server 104 for use by analysis server 104 in determining the location of enrollment fields and other fields of web forms in web pages, as will be described below.
- Analysis server 104 may also include a database or knowledge base 105 , for storing information that includes information about fields and forms associated with web pages that has been discovered either by the analysis server 104 or by the client 108 .
- Although referred to as a database or knowledge base, no specific format or type of data storage functionality is intended by the term, and any desired or convenient way of storing the field and form information may be used.
- Although the database 105 may be connected directly to the analysis server 104, embodiments may instead provide the database 105 as a remotely hosted database connected to a database server or other computer system that is then connected via the network 106 to the analysis server 104.
- The network 106 may be a single network or a collection of interconnected networks, and different elements of the system 100 may be connected to different ones of the interconnected networks.
- The network 106 may include the Internet as well as private or other public networks of any type, using any type of network protocol for communication, including Internet Protocol (IP) network protocols.
- the crowd-sourcing server 110 provides a crowd-sourcing functionality for use by the analysis server 104 .
- Crowd-sourcing server 110 typically provides a way for small tasks to be performed under conditions defined by a crowd-sourcing client (in this case, the analysis server 104 ). Individuals typically may agree to perform the tasks for a small payment. These individuals are generally not employed by the entity controlling the crowd-sourcing server 110 , but may be otherwise unconnected individuals who have enrolled with the crowd-sourcing service provider.
- the crowd-sourcing service is generally provided by a third-party service provider, although the entity controlling the analysis server may provide similar services using in-house facilities.
- One example of a crowd-sourcing service provider is Amazon Technologies, Inc., which provides the Amazon Mechanical Turk® crowd-sourcing service.
- the analysis server 104 typically has no information about the individuals who perform the requested tasks. As deployed in FIG. 1 and described in more detail below, simple questions with Yes-No answers are asked as part of the crowd-sourcing effort submitted to the crowd-sourcing server 110 .
- the analysis server 104 populates the database 105 with information regarding web pages and their respective forms and fields.
- the information stored in the database 105 may be sent from the client 108 and received by the analysis server 104 via the network 106 or may be generated by the analysis server 104 as a result of its web crawling functionality, described in more detail below.
- the client 108 may be any type of programmable device, including any type of computer system, such as desktop, laptop, tablet, or other mobile device, and includes elements of typical computer systems, such as are described below in the description of FIGS. 5-6 .
- the client 108 typically includes a web browser software 112 , containing an HTML engine 114 and an HTML parser 116 , which provide the functionality used for requesting, receiving, parsing, and displaying the web pages requested by the client 108 .
- A credential manager application 118 provides secure management for credentials such as passwords, although other information such as credit card numbers may also be managed by the credential manager application 118.
- the operation of a credential manager application is generally outside the scope of the present invention and is described herein only for specific functionality of concern to the present disclosure.
- the client 108 may also maintain a cache 120 for caching information such as the field and form information used by the credential manager 118 .
- the structure and implementation of the cache is not significant to the current disclosure, and any type of caching functionality may be used.
- the caching functionality 120 is typically non-volatile, such that the cached contents may survive a shutdown and restart of the client 108 , but volatile cache storage techniques may be used as desired, such that a restart of the client 108 or the credential manager 118 may flush the cache 120 .
- Entries in the cache 120 may include information about the web page, the web server 102, fields and form information, and any other desired information. In one embodiment, timestamps or other fields may be provided to allow any or all of the cache entries to be marked as invalid and not to be used.
- The credential manager may mark entries associated with a web page as invalid upon discovering that the web page has changed since the entry was created.
- the analysis server 104 may push cache entries to the client 108 for insertion into the cache 120 , and may also instruct the credential manager 118 to mark one or more cache entries as invalid.
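The cache behavior described above (entries carrying timestamps, entries marked invalid rather than used, and entries pushed by the analysis server) can be sketched as follows. This is a minimal illustration only: the names `CacheEntry`, `FieldFormCache`, and every attribute are hypothetical, and the disclosure itself states that any caching structure may be used.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    """One cached record of field and form information (hypothetical layout)."""
    page_url: str
    form_type: str            # e.g. "login" or "enrollment"
    field_selectors: dict     # field role -> locator, e.g. {"password": "#pw"}
    created_at: float = field(default_factory=time.time)
    invalid: bool = False


class FieldFormCache:
    """In-memory stand-in for cache 120; persistence is deliberately left out."""

    def __init__(self):
        self._entries = {}

    def insert(self, entry):
        # Entries may come from local analysis or be pushed by the analysis server.
        self._entries[entry.page_url] = entry

    def invalidate(self, page_url):
        # Mark the entry instead of deleting it, so it is kept but never used.
        entry = self._entries.get(page_url)
        if entry is not None:
            entry.invalid = True

    def invalidate_older_than(self, cutoff):
        # Timestamp-based invalidation of every entry created before `cutoff`.
        for entry in self._entries.values():
            if entry.created_at < cutoff:
                entry.invalid = True

    def lookup(self, page_url):
        # Return only entries that exist and are not marked invalid.
        entry = self._entries.get(page_url)
        if entry is None or entry.invalid:
            return None
        return entry
```

Marking entries invalid instead of removing them is one possible reading of the text; deleting them outright would serve equally well for this sketch.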
- FIG. 2 is a flowchart illustrating a technique for a credential manager application 118 to use field and form information obtained from the analysis server 104 for inserting credentials into a web form of a web page served by web server 102 .
- the client 108 loads the web page from the web server 102 .
- the credential manager 118 now tries to determine where the appropriate fields (if any) are on the web page for submission as a web form.
- The credential manager 118 checks the cache 120 to see if there are entries for the current web page. If any cache 120 entries are found in block 230 that are not marked invalid, the credential manager may then determine in block 260 whether the field and form information in those entries matches the actual current web page. If so, then in block 290 the credential manager 118 may use those entries to insert credentials into one or more of the fields of the web form on the web page.
- If no usable cache entry is found, the credential manager 118 may query the analysis server 104 for any field and form information the analysis server 104 may hold related to the current web page, causing the analysis server 104 to search the database 105. If the client 108 fails to receive any field or form information for the current web page from the analysis server 104, as determined in block 250, then in block 270 the credential manager 118 may analyze the web page to locate fields and forms in the current web page. The credential manager may then provide the discovered fields and forms to the analysis server 104 in block 280, which may choose to push that newly discovered field and form information out to other clients 108.
- The credential manager may then use the discovered field and form information to insert credentials into the web page in block 290.
- If the credential manager 118 receives field and form information for the current web page from the analysis server 104, as determined in block 250, then the field and form information is checked in block 260 as described above, to determine whether the field and form information from the analysis server 104 matches the current web page. If no match exists, then the credential manager may proceed in block 270 as if no field and form information was received, searching for fields in the web page. Although not illustrated in FIG. 2, the cache 120 may at any time be updated by the analysis server 104 for use in the illustrated procedure.
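The lookup order of FIG. 2 can be summarized in a short sketch. The function name and all of its callable parameters are hypothetical stand-ins for the cache, the analysis server, and the local page analysis. The sketch also assumes that a cached entry failing the block 260 match check leads directly to local analysis in block 270, which is one reading of the flowchart rather than something the text states outright.

```python
def resolve_field_info(page, cache_lookup, query_server, matches_page,
                       analyze_page, report_to_server):
    """Return (field_info, source) for `page`, following the FIG. 2 order."""
    url = page["url"]
    # Block 230: prefer a cache entry that is not marked invalid.
    candidate, source = cache_lookup(url), "cache"
    if candidate is None:
        # Otherwise ask the analysis server; block 250 checks whether
        # anything came back.
        candidate, source = query_server(url), "server"
    # Block 260: use the candidate only if it matches the page as loaded.
    if candidate is not None and matches_page(candidate, page):
        return candidate, source
    # Block 270: no usable information, so analyze the page locally.
    discovered = analyze_page(page)
    # Block 280: report the discovery so the server may push it to others.
    report_to_server(url, discovered)
    return discovered, "local"
```

The returned information is what block 290 would then use to insert credentials. Passing the collaborators in as callables keeps the sketch independent of any particular cache or network implementation.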
- FIG. 3 is a flowchart illustrating an asynchronous technique for the analysis server 104 to discover fields and form information in web pages.
- the analysis server 104 may include other techniques to automatically discover field and form information.
- the analysis server employs a web crawler to crawl the web for web pages that may have forms for submitting credentials. Web crawling techniques are well known and are not further described herein.
- the web crawler employed by the analysis server 104 may limit the crawling to web pages identified by a third party web page ranking resource as highly popular web pages. The analysis server may then prefill the cache 120 of clients 108 with information regarding such popular web pages, pushing the information via the network 106 to the clients 108 . This crawling technique is asynchronous to the client 108 's activity.
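The text does not specify how field and form information is discovered once a page has been fetched. As a rough sketch under that caveat, the following uses Python's standard `html.parser` module to collect the `<input>` elements of each `<form>` and keeps forms that contain a password field. The class name, function name, and the password-field heuristic are assumptions made for illustration.

```python
from html.parser import HTMLParser


class FormFieldFinder(HTMLParser):
    """Collect the <input> elements found inside each <form> of a page."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per <form> encountered
        self._current = None   # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action"), "fields": []}
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            self._current["fields"].append({
                "name": attrs.get("name"),
                "id": attrs.get("id"),
                # A missing type attribute defaults to a text input.
                "type": (attrs.get("type") or "text").lower(),
            })

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def discover_credential_forms(html):
    """Return forms with at least one password field (a simple heuristic)."""
    finder = FormFieldFinder()
    finder.feed(html)
    return [form for form in finder.forms
            if any(f["type"] == "password" for f in form["fields"])]
```

A static parse like this would miss forms built by scripts after the page loads, which is one reason a real crawler might render pages in a headless browser instead.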
- FIG. 3 illustrates a procedure for verifying field and form information, whether received from client 108 or asynchronously discovered by web crawling.
- the analysis server may receive the field and form information, either from the client 108 or the web crawler.
- The analysis server may obtain the page and create a screenshot showing what a user would see on the screen if that web page were displayed by a browser. In one embodiment, this may be achieved by executing a headless browser to format the data that would be displayed, but without an actual display.
- The screenshot may then be annotated to mark the position of the fields corresponding to the field and form information, in some embodiments identifying the type of field, such as whether the field is a password field. Any type of visual marking may be used, such as surrounding the field with a border of a contrasting color.
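Surrounding a field with a contrasting border can be illustrated without any imaging library by treating the screenshot as a grid of RGB tuples. The function name, the pixel representation, and the bounding-box convention below are all assumptions made for this sketch; a real implementation would more likely draw on an image produced by the headless browser.

```python
def draw_border(image, box, color, thickness=2):
    """Draw a border just outside `box` on a row-major pixel grid.

    image: list of rows, each a list of (r, g, b) tuples, modified in place.
    box:   (left, top, right, bottom) of the field; right and bottom exclusive.
    """
    height, width = len(image), len(image[0])
    left, top, right, bottom = box
    # Grow the box by `thickness` on each side, clamped to the image bounds.
    outer_left = max(left - thickness, 0)
    outer_top = max(top - thickness, 0)
    outer_right = min(right + thickness, width)
    outer_bottom = min(bottom + thickness, height)
    for y in range(outer_top, outer_bottom):
        for x in range(outer_left, outer_right):
            inside_field = left <= x < right and top <= y < bottom
            if not inside_field:
                # Only the ring around the field is recolored.
                image[y][x] = color
    return image
```

Different colors could be passed for different field types, for example one color for a password field and another for a username field, so that reviewers can also judge whether the field type was identified correctly.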
- the annotated screenshot may be sent to the crowd-sourcing service provider via server 110 , with a request for crowd-sourced validation of the fields.
- The server 110 is requested to have three individuals review the annotated screenshot and respond with a simple yes/no answer to a question of whether the fields are correctly identified and marked.
- the question may be communicated in any desired way, including either separately from the screenshot or contained in the screenshot.
- a positive result comprises all three of the crowd sourcers responding Yes to the question, saying that the fields and form information are correct as they are held by the analysis server 104 and sent to the crowd sourcer by the crowd sourcing server 110 .
- a negative response comprises no more than one of the three crowd sourcers voting Yes, and a mixed response comprises two of the three voting Yes, but one of the three voting No.
- Using three crowd sourcers is illustrative and by way of example only, and any number may be used.
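The three-reviewer rule can be written out directly. The function name and the result strings are hypothetical; the thresholds follow the example in the text (three Yes votes is positive, at most one is negative, exactly two is mixed). Because the text does not say how the rule scales to other reviewer counts, this sketch rejects anything other than three votes instead of guessing.

```python
def classify_votes(votes):
    """Classify three yes/no answers as 'positive', 'negative', or 'mixed'."""
    if len(votes) != 3:
        raise ValueError("this sketch covers only the three-reviewer example")
    yes_count = sum(1 for vote in votes if vote)
    if yes_count == 3:
        return "positive"   # all three reviewers answered Yes
    if yes_count <= 1:
        return "negative"   # no more than one reviewer answered Yes
    return "mixed"          # two Yes votes and one No vote
```

For instance, `classify_votes([True, True, False])` returns `"mixed"`.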
- On a positive result, the field and form information may be accepted and pushed out to the caches 120 for use by the credential managers 118 on the clients 108.
- On a negative result, the screenshot's annotation of the web page is considered incorrect and the screenshot may be presented to another human being in block 370 to allow the human being to generate a new set of field and form information in block 372.
- The new and presumably corrected information may then be pushed to the user caches 120 in block 392.
- On a mixed result, the screenshot may be sent to another human being in block 355 for a final decision on the validity of the field and form annotations of the web page. If the human arbiter's decision is positive, as determined in block 380, then the procedure accepts the field and form information as valid and pushes the information to clients 108 in block 392. Similarly, if the human arbiter's decision is negative, the actions of blocks 370, 372, and 392 may be performed as described above.
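The handling of the three possible outcomes can be sketched as a small routing function. The function name, the result strings, and the three callables standing in for the human arbiter, the human corrector, and the push to client caches are all hypothetical.

```python
def handle_validation_result(result, info, arbiter_approves, human_correct,
                             push_to_clients):
    """Route a validation result to acceptance, arbitration, or correction."""
    if result == "mixed":
        # Blocks 355 and 380: a human arbiter makes the final decision.
        result = "positive" if arbiter_approves(info) else "negative"
    if result == "negative":
        # Blocks 370 and 372: a human generates new field and form information.
        info = human_correct(info)
    elif result != "positive":
        raise ValueError(f"unknown validation result: {result!r}")
    # Block 392: push the accepted or corrected information to client caches.
    push_to_clients(info)
    return info
```

Resolving a mixed result into positive or negative first means the acceptance and correction paths are written only once, which mirrors how the arbiter's decision rejoins the main flow in the description.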
- Instead of or in addition to sending the annotated screenshots to a crowd-sourcing service provider at server 110, other techniques may be used.
- For example, computer vision techniques may be used to examine the annotated screenshots, and machine learning techniques may be used directly in either a headless browser or a GUI browser, with or without marked fields, to generate a decision on whether the field and form information is correct, giving the decision a confidence level.
- A positive decision would then be a decision with a high confidence level that the field and form information is correct; a negative decision would be a decision with a high confidence level that the field and form information is incorrect; and a mixed decision would be a decision with a lower confidence level.
- A high confidence level that the field and form information is correct may be a confidence level that exceeds a first predetermined threshold.
- A high confidence level that the field is incorrect may be a confidence level that is lower than a second predetermined threshold.
- A mixed decision may be a confidence level between the first and second predetermined thresholds.
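The two-threshold rule maps a single confidence value onto the same three outcomes used for crowd-sourced results. In the sketch below the function name and the default threshold values of 0.9 and 0.3 are purely illustrative assumptions; the text says only that the thresholds are predetermined.

```python
def classify_confidence(confidence, first_threshold=0.9, second_threshold=0.3):
    """Map a confidence that the information is correct to a decision."""
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    if confidence > first_threshold:
        return "positive"   # high confidence that the information is correct
    if confidence < second_threshold:
        return "negative"   # high confidence that the information is incorrect
    return "mixed"          # between the thresholds: defer to further review
```

A value exactly equal to either threshold falls into the mixed band here, since the text speaks of exceeding the first threshold and being lower than the second.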
- the processing of the computer-vision guided embodiment may then follow the procedure of blocks 350 - 392 of FIG. 3 .
- Other embodiments may combine computer vision and crowd sourcing or use other automatic or semi-automatic techniques for verifying or validating the discovered field and form information.
- crowd-sourcing may be used as a backup to asynchronously discovering field and form information by the analysis server 104 , providing a better user experience for the user of client 108 , by detecting erroneous decisions about field and form information.
- These techniques are scalable, because any number of analysis servers may be used for performing the web crawling and backend analysis, and because any number of crowd sourcers may be employed by the crowd sourcing service provider without the need for the analysis server provider to hire dedicated staff to review and make decisions on field and form information at the level that would be needed to review very large numbers of web pages.
- The analysis server 104 may instruct the clients 108 to invalidate the corresponding entry in the cache 120.
- If the analysis procedure illustrated in FIG. 3 determines that the field and form information is incorrect, the analysis server 104 may instruct the clients 108 to invalidate the corresponding entry in the cache 120 before pushing the new field and form information to the clients 108 for storing in the cache.
- Referring now to FIG. 4, a block diagram illustrates a programmable device 400 that may be used as the analysis server 104 or the client 108 in accordance with one embodiment.
- the programmable device 400 illustrated in FIG. 4 is a multiprocessor programmable device that includes a first processing element 470 and a second processing element 480 . While two processing elements 470 and 480 are shown, an embodiment of programmable device 400 may also include only one such processing element.
- Programmable device 400 is illustrated as a point-to-point interconnect system, in which the first processing element 470 and second processing element 480 are coupled via a point-to-point interconnect 450 .
- Any or all of the interconnects illustrated in FIG. 4 may be implemented as a multi-drop bus rather than point-to-point interconnects.
- each of processing elements 470 and 480 may be multicore processors, including first and second processor cores (i.e., processor cores 474 a and 474 b and processor cores 484 a and 484 b ).
- Such cores 474 a , 474 b , 484 a , 484 b may be configured to execute instruction code in a manner similar to that discussed above in connection with FIGS. 1-3 .
- other embodiments may use processing elements that are single core processors as desired.
- each processing element may be implemented with different numbers of cores as desired.
- Each processing element 470 , 480 may include at least one shared cache 446 .
- the shared cache 446 a , 446 b may store data (e.g., instructions) that are utilized by one or more components of the processing element, such as the cores 474 a , 474 b and 484 a , 484 b , respectively.
- The shared cache may locally cache data stored in a memory 432, 434 for faster access by components of the processing elements 470, 480.
- The shared cache 446 a, 446 b may include one or more mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, a last level cache (LLC), or combinations thereof.
- FIG. 4 illustrates a programmable device with two processing elements 470, 480 for clarity of the drawing.
- One or more of processing elements 470, 480 may be an element other than a processor, such as a graphics processing unit (GPU), a digital signal processing (DSP) unit, a field programmable gate array, or any other programmable processing element.
- Processing element 480 may be heterogeneous or asymmetric to processing element 470 .
- the various processing elements 470 , 480 may reside in the same die package.
- First processing element 470 may further include memory controller logic (MC) 472 and point-to-point (P-P) interconnects 476 and 478 .
- second processing element 480 may include a MC 482 and P-P interconnects 486 and 488 .
- MCs 472 and 482 couple processing elements 470 , 480 to respective memories, namely a memory 432 and a memory 434 , which may be portions of main memory locally attached to the respective processors.
- While MC logic 472 and 482 is illustrated as integrated into processing elements 470, 480, in some embodiments the memory controller logic may be discrete logic outside processing elements 470, 480 rather than integrated therein.
- I/O subsystem 490 may be coupled to a first link 416 via an interface 496 .
- First link 416 may be a Peripheral Component Interconnect (PCI) bus, or a bus such as a PCI Express bus or another I/O interconnect bus, although the scope of the present invention is not so limited.
- Various I/O devices 414, 424 may be coupled to first link 416, along with a bridge 418 that may couple first link 416 to a second link 420.
- Second link 420 may be a low pin count (LPC) bus.
- Various devices may be coupled to second link 420 including, for example, a keyboard/mouse 412 , communication device(s) 426 (which may in turn be in communication with the computer network 403 ), and a data storage unit 428 such as a disk drive or other mass storage device which may include code 430 , in one embodiment.
- the code 430 may include instructions for performing embodiments of one or more of the techniques described above.
- an audio I/O 424 may be coupled to second link 420 .
- a system may implement a multi-drop bus or another such communication topology.
- Although links 416 and 420 are illustrated as busses in FIG. 4, any desired type of link may be used.
- the elements of FIG. 4 may alternatively be partitioned using more or fewer integrated chips than illustrated in FIG. 4 .
- Referring now to FIG. 5, a block diagram illustrates a programmable device 500 according to another embodiment. Certain aspects of FIG. 4 have been omitted from FIG. 5 in order to avoid obscuring other aspects of FIG. 5.
- FIG. 5 illustrates that processing elements 570, 580 may include integrated memory and I/O control logic (“CL”) 572 and 582, respectively.
- The CL 572, 582 may include memory control logic (MC) such as that described above in connection with FIG. 4.
- CL 572, 582 may also include I/O control logic.
- FIG. 5 illustrates that not only may the memories 532, 534 be coupled to the CL 572, 582, but also that I/O devices 544 may be coupled to the control logic 572, 582.
- Legacy I/O devices 515 may be coupled to the I/O subsystem 590 by interface 596 .
- Each processing element 570 , 580 may include multiple processor cores, illustrated in FIG.
- I/O subsystem 590 includes point-to-point (P-P) interconnects 594 and 598 that connect to P-P interconnects 576 and 586 of the processing elements 570 and 580 with links 552 and 554 .
- Processing elements 570 and 580 may also be interconnected by link 550 and interconnects 578 and 588 , respectively.
- FIGS. 4 and 5 are schematic illustrations of embodiments of programmable devices that may be utilized to implement various embodiments discussed herein. Various components of the programmable devices depicted in FIGS. 4 and 5 may be combined in a system-on-a-chip (SoC) architecture.
- Infrastructure 600 contains computer networks 602 .
- Computer networks 602 may include many different types of computer networks available today, such as the Internet, a corporate network or a Local Area Network (LAN). Each of these networks can contain wired or wireless programmable devices and operate using any number of network protocols (e.g., TCP/IP).
- Networks 602 may be connected to gateways and routers (represented by 608 ), end user computers 606 , and computer servers 604 .
- Infrastructure 600 also includes cellular network 603 for use with mobile communication devices. Mobile cellular networks support mobile phones and many other types of mobile devices. Mobile devices in the infrastructure 600 are illustrated as mobile phones 610 , laptops 612 and tablets 614 .
- In Example 2, the subject matter of Example 1 optionally includes wherein the instructions further comprise instructions that when executed cause the machine to instruct the credential manager application to invalidate some or all of a cache maintained by the credential manager application.
- In Example 3, the subject matter of Example 1 optionally includes wherein the instructions further comprise instructions that when executed cause the machine to send the field and form information to a human arbiter responsive to a mixed result from the crowd-sourcing service.
- In Example 4, the subject matter of Examples 1-3 optionally includes wherein the instructions to validate the field and form information using a crowd-sourcing service comprise instructions that when executed cause the machine to: generate a screenshot of the web page; mark the fields on the screenshot; and send the screenshot to the crowd-sourcing service for validation.
- In Example 7, the subject matter of Example 6 optionally includes wherein the instructions further comprise instructions that when executed cause the machine to: pass the screenshot to a human reviewer responsive to the confidence level being lower than a predetermined threshold.
- In Example 8, the subject matter of Examples 1-3 optionally includes wherein the instructions that when executed cause the machine to receive the web page comprise instructions that when executed cause the machine to: employ a web crawler for examining web pages.
- In Example 9, the subject matter of Example 8 optionally includes wherein the web crawler is configured to crawl a predetermined set of popular web pages.
- Example 10 is a computer system for determining web form information in a web page for a web site comprising: one or more processors; and a memory coupled to the one or more processors, on which are stored instructions, comprising instructions that when executed cause at least some of the one or more of the processors to: receive a web page for a web site over a network by an analysis server; discover field and form information for the web page; validate the field and form information using a crowd-sourcing service; accept the field and form information as validated field and form information responsive to a positive result from the crowd-sourcing service; receive a corrected field and form information from a human reviewer responsive to a negative result from the crowd-sourcing service; and send the validated or corrected field and form information from the analysis server to a credential manager application.
- In Example 11, the subject matter of Example 10 optionally includes wherein the instructions further comprise instructions that when executed cause at least some of the one or more processors to instruct the credential manager application to invalidate some or all of a cache maintained by the credential manager application.
- In Example 13, the subject matter of Examples 10-12 optionally includes wherein the instructions to validate the field and form information using a crowd-sourcing service comprise instructions that when executed cause at least some of the one or more processors to: generate a screenshot of the web page; mark the fields on the screenshot; and send the screenshot to the crowd-sourcing service for validation.
- In Example 15, the subject matter of Examples 10-12 optionally includes wherein the instructions further comprise instructions that when executed cause at least some of the one or more processors to: use computer vision to view a screenshot of the web page annotated corresponding to the field and form information; and return a decision based on computer vision that indicates a confidence level that the field and form information is correct.
- In Example 16, the subject matter of Example 15 optionally includes wherein the instructions further comprise instructions that when executed cause at least some of the one or more processors to: pass the screenshot to a human reviewer responsive to the confidence level being lower than a predetermined threshold.
- In Example 17, the subject matter of Examples 10-12 optionally includes wherein the instructions that when executed cause at least some of the one or more processors to receive the web page comprise instructions that when executed cause at least some of the one or more processors to: employ a web crawler for examining web pages.
- In Example 18, the subject matter of Example 17 optionally includes wherein the web crawler is configured to crawl a predetermined set of popular web pages.
- Example 19 is a method for determining web form information in a web page for a web site, comprising: receiving a web page for a web site over a network by an analysis server; discovering field and form information for the web page; validating the field and form information using a crowd-sourcing service; accepting the field and form information as validated field and form information responsive to a positive result from the crowd-sourcing service; receiving a corrected field and form information from a human reviewer responsive to a negative result from the crowd-sourcing service; and sending the validated or corrected field and form information from the analysis server to a credential manager application.
- In Example 20, the subject matter of Example 19 optionally includes further comprising instructing the credential manager application to invalidate some or all of a cache maintained by the credential manager application.
- In Example 21, the subject matter of Example 19 optionally includes further comprising sending the field and form information to a human arbiter responsive to a mixed result from the crowd-sourcing service.
- In Example 22, the subject matter of Examples 19-21 optionally includes wherein validating the field and form information using a crowd-sourcing service comprises: generating a screenshot of the web page; marking the fields on the screenshot; and sending the screenshot to the crowd-sourcing service for validation.
- In Example 23, the subject matter of Examples 19-21 optionally includes wherein validating the field and form information using a crowd-sourcing service comprises: requesting at least three crowd-source responses from the crowd-sourcing service; and considering the at least three crowd-source responses as a voting result.
- In Example 24, the subject matter of Examples 19-21 optionally includes further comprising: using computer vision to view a screenshot of the web page annotated corresponding to the field and form information; and returning a decision based on computer vision that indicates a confidence level that the field and form information is correct.
- In Example 25, the subject matter of Examples 19-21 optionally includes wherein receiving the web page comprises: employing a web crawler for examining web pages.
- Example 26 is a computer system, comprising: means for receiving a web page for a web site over a network by an analysis server; means for discovering field and form information for the web page; means for validating the field and form information using a crowd-sourcing service; means for accepting the field and form information as validated field and form information responsive to a positive result from the crowd-sourcing service; means for receiving a corrected field and form information from a human reviewer responsive to a negative result from the crowd-sourcing service; and means for sending the validated or corrected field and form information from the analysis server to a credential manager application.
- In Example 27, the subject matter of Example 26 optionally includes further means for instructing the credential manager application to invalidate some or all of a cache maintained by the credential manager application.
- In Example 28, the subject matter of Example 26 optionally includes further comprising sending the field and form information to a human arbiter responsive to a mixed result from the crowd-sourcing service.
- In Example 29, the subject matter of Examples 26-28 optionally includes wherein the means for validating the field and form information using a crowd-sourcing service comprise: means for generating a screenshot of the web page; means for marking the fields on the screenshot; and means for sending the screenshot to the crowd-sourcing service for validation.
- In Example 30, the subject matter of Examples 26-28 optionally includes wherein the means for validating the field and form information using a crowd-sourcing service comprise: means for requesting at least three crowd-source responses from the crowd-sourcing service; and means for considering the at least three crowd-source responses as a voting result.
- In Example 31, the subject matter of Examples 26-28 optionally includes further comprising: means for using computer vision to view a screenshot of the web page annotated corresponding to the field and form information; and means for returning a decision based on computer vision that indicates a confidence level that the field and form information is correct.
- In Example 32, the subject matter of Example 31 optionally includes further comprising: means for passing the screenshot to a human reviewer responsive to the confidence level being lower than a predetermined threshold.
- In Example 33, the subject matter of Examples 26-28 optionally includes wherein the means for receiving the web page comprise: means for employing a web crawler for examining web pages.
- In Example 34, the subject matter of Example 33 optionally includes wherein the web crawler is configured to crawl a predetermined set of popular web pages.
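The routing described in the examples above (a positive crowd-source result is accepted, a negative result goes to a human reviewer for correction, a mixed result goes to a human arbiter, and a low computer-vision confidence level is escalated to a human reviewer) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `Outcome` enum, the `tally_votes` and `needs_human_review` functions, the rule that only unanimous votes count as positive or negative, and the 0.8 default threshold are assumptions chosen for illustration. The examples only state that at least three responses are treated as a voting result and that the threshold is predetermined.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    """Where the discovered field and form information goes next."""
    ACCEPT = "accept"                  # positive result: treat as validated
    HUMAN_REVIEWER = "human_reviewer"  # negative result: reviewer supplies a correction
    HUMAN_ARBITER = "human_arbiter"    # mixed result: arbiter decides


# The examples call for "at least three" crowd-source responses.
MIN_RESPONSES = 3


def tally_votes(votes: Sequence[bool]) -> Outcome:
    """Treat crowd-source responses as a voting result.

    Assumption (not stated in the source): a result is positive or negative
    only when the votes are unanimous; any disagreement counts as mixed.
    """
    if len(votes) < MIN_RESPONSES:
        raise ValueError(
            f"need at least {MIN_RESPONSES} responses, got {len(votes)}"
        )
    yes = sum(1 for vote in votes if vote)
    if yes == len(votes):
        return Outcome.ACCEPT
    if yes == 0:
        return Outcome.HUMAN_REVIEWER
    return Outcome.HUMAN_ARBITER


def needs_human_review(confidence: float, threshold: float = 0.8) -> bool:
    """Escalate when the computer-vision confidence is below the threshold.

    The 0.8 default is a placeholder; the source only says "predetermined".
    """
    return confidence < threshold
```

Under these assumptions, `tally_votes([True, False, True])` yields the arbiter path, while three matching votes resolve without arbitration. A majority rule would be an equally plausible reading of "voting result"; unanimity is used here only because it makes the three outcomes easy to tell apart.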
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Data Mining & Analysis (AREA)
- Information Transfer Between Computers (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/864,448 US10482167B2 (en) | 2015-09-24 | 2015-09-24 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
PCT/US2016/053164 WO2017053602A1 (en) | 2015-09-24 | 2016-09-22 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
US16/687,248 US11055480B2 (en) | 2015-09-24 | 2019-11-18 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/864,448 US10482167B2 (en) | 2015-09-24 | 2015-09-24 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/687,248 Continuation US11055480B2 (en) | 2015-09-24 | 2019-11-18 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170091163A1 (en) | 2017-03-30 |
US10482167B2 (en) | 2019-11-19 |
Family
ID=58387489
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/864,448 Active 2036-04-12 US10482167B2 (en) | 2015-09-24 | 2015-09-24 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
US16/687,248 Active US11055480B2 (en) | 2015-09-24 | 2019-11-18 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/687,248 Active US11055480B2 (en) | 2015-09-24 | 2019-11-18 | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Country Status (2)
Country | Link |
---|---|
US (2) | US10482167B2 (en) |
WO (1) | WO2017053602A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11055480B2 (en) | 2015-09-24 | 2021-07-06 | Mcafee, Llc | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10878057B2 (en) * | 2019-02-04 | 2020-12-29 | Citrix Systems, Inc. | Web application with custom form components |
US11893620B2 (en) * | 2020-12-18 | 2024-02-06 | The Yes Platform, Inc. | Order management systems and methods |
Citations (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6606663B1 (en) * | 1998-09-29 | 2003-08-12 | Openwave Systems Inc. | Method and apparatus for caching credentials in proxy servers for wireless user agents |
US20040205618A1 (en) * | 2001-11-19 | 2004-10-14 | Jean Sini | Runtime translator for mobile application content |
US20080184100A1 (en) | 2007-01-30 | 2008-07-31 | Oracle International Corp | Browser extension for web form fill |
US20080313529A1 (en) | 2007-06-15 | 2008-12-18 | Microsoft Corporation | Increasing accuracy in determining purpose of fields in forms |
US20090157557A1 (en) * | 2001-03-15 | 2009-06-18 | American Express Travel Related Services Company, Inc. | Merchant system facilitating an online card present transaction |
US20100082662A1 (en) * | 2008-09-25 | 2010-04-01 | Microsoft Corporation | Information Retrieval System User Interface |
US20120072253A1 (en) * | 2010-09-21 | 2012-03-22 | Servio, Inc. | Outsourcing tasks via a network |
US20120166464A1 (en) | 2010-12-27 | 2012-06-28 | Nokia Corporation | Method and apparatus for providing input suggestions |
US20120265574A1 (en) * | 2011-04-12 | 2012-10-18 | Jana Mobile, Inc. | Creating incentive hierarchies to enable groups to accomplish goals |
US20120265573A1 (en) * | 2011-03-23 | 2012-10-18 | CrowdFlower, Inc. | Dynamic optimization for data quality control in crowd sourcing tasks to crowd labor |
US20130197954A1 (en) * | 2012-01-30 | 2013-08-01 | Crowd Control Software, Inc. | Managing crowdsourcing environments |
US20130275803A1 (en) * | 2012-04-13 | 2013-10-17 | International Business Machines Corporation | Information governance crowd sourcing |
US8682674B1 (en) | 2004-06-18 | 2014-03-25 | Glenbrook Networks | System and method for facts extraction and domain knowledge repository creation from unstructured and semi-structured documents |
US20140173405A1 (en) | 2012-12-19 | 2014-06-19 | Google Inc. | Using custom functions to dynamically manipulate web forms |
US8869022B1 (en) * | 2010-06-22 | 2014-10-21 | Intuit Inc. | Visual annotations and spatial maps for facilitating application use |
US20140317678A1 (en) * | 2013-04-22 | 2014-10-23 | Microsoft Corporation | Policy enforcement by end user review |
US20140380141A1 (en) | 2013-03-14 | 2014-12-25 | Goformz, Inc. | System and method for converting paper forms to an electronic format |
US9218364B1 (en) * | 2011-01-28 | 2015-12-22 | Yahoo! Inc. | Monitoring an any-image labeling engine |
US9767262B1 (en) * | 2011-07-29 | 2017-09-19 | Amazon Technologies, Inc. | Managing security credentials |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008141425A1 (en) * | 2007-05-17 | 2008-11-27 | Fat Free Mobile Inc. | Method and system for aggregate web site database price watch feature |
US8429734B2 (en) * | 2007-07-31 | 2013-04-23 | Symantec Corporation | Method for detecting DNS redirects or fraudulent local certificates for SSL sites in pharming/phishing schemes by remote validation and using a credential manager and recorded certificate attributes |
US7782861B2 (en) | 2008-07-08 | 2010-08-24 | Harris Corporation | Configuration and alignment tool for computer network radio equipment |
US20130198598A1 (en) * | 2012-01-18 | 2013-08-01 | OneID Inc. | Secure population of form data |
US20130346128A1 (en) * | 2012-06-20 | 2013-12-26 | Epiq Ediscovery Solutions, Inc. | System and Method of Reviewing and Producing Documents |
US20140015749A1 (en) * | 2012-07-10 | 2014-01-16 | University Of Rochester, Office Of Technology Transfer | Closed-loop crowd control of existing interface |
CN104507440A (en) * | 2012-07-26 | 2015-04-08 | Allergan, Inc. | Dual cap system for container-closures to maintain tip sterility during shelf storage |
US9461876B2 (en) * | 2012-08-29 | 2016-10-04 | Loci | System and method for fuzzy concept mapping, voting ontology crowd sourcing, and technology prediction |
US20140067451A1 (en) * | 2012-08-30 | 2014-03-06 | Xerox Corporation | Hybrid Multi-Iterative Crowdsourcing System |
US20140223284A1 (en) * | 2013-02-01 | 2014-08-07 | Brokersavant, Inc. | Machine learning data annotation apparatuses, methods and systems |
US10482167B2 (en) | 2015-09-24 | 2019-11-19 | Mcafee, Llc | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page |
- 2015-09-24: US application US14/864,448 (patent US10482167B2), status: Active
- 2016-09-22: WO application PCT/US2016/053164 (WO2017053602A1), status: Application Filing
- 2019-11-18: US application US16/687,248 (patent US11055480B2), status: Active
Non-Patent Citations (3)
Title |
---|
International Searching Authority, "International Search Report," issued in connection with International Application No. PCT/US2016/053164, dated Dec. 27, 2016, 10 pages. |
International Searching Authority, "Written Opinion of the International Searching Authority," issued in connection with International Application No. PCT/US2016/053164, dated Dec. 27, 2016, 7 pages. |
Yuen et al., A Survey of Crowdsourcing Systems, IEEE International Conference on Privacy, Security, Risk, and Trust, and IEEE International Conference on Social Computing (2011). * |
Also Published As
Publication number | Publication date |
---|---|
US11055480B2 (en) | 2021-07-06 |
US20200159988A1 (en) | 2020-05-21 |
WO2017053602A1 (en) | 2017-03-30 |
US20170091163A1 (en) | 2017-03-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10262142B2 (en) | Systems and methods for advanced dynamic analysis scanning | |
US10740411B2 (en) | Determining repeat website users via browser uniqueness tracking | |
US9977892B2 (en) | Dynamically updating CAPTCHA challenges | |
US8301653B2 (en) | System and method for capturing and reporting online sessions | |
US20240012641A1 (en) | Model construction method and apparatus, and medium and electronic device | |
US20170093828A1 (en) | System and method for detecting whether automatic login to a website has succeeded | |
US10313364B2 (en) | Adaptive client-aware session security | |
US10523643B1 (en) | Systems and methods for enhanced security based on user vulnerability | |
US20070156592A1 (en) | Secure authentication method and system | |
US11140153B2 (en) | Techniques for identification of location of relevant fields in a credential-seeking web page | |
CN108268635B (en) | Method and apparatus for acquiring data | |
US11055480B2 (en) | Crowd-source as a backup to asynchronous identification of a type of form and relevant fields in a credential-seeking web page | |
US8407766B1 (en) | Method and apparatus for monitoring sensitive data on a computer network | |
US8832805B1 (en) | Verifying user information | |
US20240179139A1 (en) | Auto-Form Fill Based Website Authentication | |
US20140173693A1 (en) | Cookie Optimization | |
WO2021134873A1 (en) | Data acquisition method, related device and system thereof and storage apparatus | |
US20200036749A1 (en) | Web browser incorporating social and community features | |
US20220391843A1 (en) | Method and system for streamlining voting process | |
US11586696B2 (en) | Enhanced web browsing | |
WO2020252880A1 (en) | Reverse turing verification method and apparatus, storage medium, and electronic device | |
US11522942B2 (en) | System and method for parsing application network activity | |
US20210342413A1 (en) | Identifying code dependencies in web applications | |
Nie | An Application Programming Interface to Query Real-Time Water Data | |
CN114579953A (en) | Password determination method, password determination device, terminal and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: MCAFEE, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LUPIEN, NICOLAS;LAKHIA, MICHAEL;GAGNON-LAMONDE, HUBERT;SIGNING DATES FROM 20151111 TO 20160302;REEL/FRAME:039825/0904 |
| | AS | Assignment | Owner name: MCAFEE, LLC, CALIFORNIA Free format text: CHANGE OF NAME AND ENTITY CONVERSION;ASSIGNOR:MCAFEE, INC.;REEL/FRAME:043665/0918 Effective date: 20161220 |
| | AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:045055/0786 Effective date: 20170929 Owner name: MORGAN STANLEY SENIOR FUNDING, INC., MARYLAND Free format text: SECURITY INTEREST;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:045056/0676 Effective date: 20170929 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: MORGAN STANLEY SENIOR FUNDING, INC., MARYLAND Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE REMOVE PATENT 6336186 PREVIOUSLY RECORDED ON REEL 045056 FRAME 0676. ASSIGNOR(S) HEREBY CONFIRMS THE SECURITY INTEREST;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:054206/0593 Effective date: 20170929 Owner name: JPMORGAN CHASE BANK, N.A., NEW YORK Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE REMOVE PATENT 6336186 PREVIOUSLY RECORDED ON REEL 045055 FRAME 786. ASSIGNOR(S) HEREBY CONFIRMS THE SECURITY INTEREST;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:055854/0047 Effective date: 20170929 |
| | AS | Assignment | Owner name: MCAFEE, LLC, CALIFORNIA Free format text: RELEASE OF INTELLECTUAL PROPERTY COLLATERAL - REEL/FRAME 045055/0786;ASSIGNOR:JPMORGAN CHASE BANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:054238/0001 Effective date: 20201026 |
| | AS | Assignment | Owner name: MCAFEE, LLC, CALIFORNIA Free format text: RELEASE OF INTELLECTUAL PROPERTY COLLATERAL - REEL/FRAME 045056/0676;ASSIGNOR:MORGAN STANLEY SENIOR FUNDING, INC., AS COLLATERAL AGENT;REEL/FRAME:059354/0213 Effective date: 20220301 |
| | AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT AND COLLATERAL AGENT, NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:059354/0335 Effective date: 20220301 |
| | AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, NEW YORK Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE THE PATENT TITLES AND REMOVE DUPLICATES IN THE SCHEDULE PREVIOUSLY RECORDED AT REEL: 059354 FRAME: 0335. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNOR:MCAFEE, LLC;REEL/FRAME:060792/0307 Effective date: 20220301 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |