When performing Web Application Scanning with Qualys WAS, you may experience long scan times or a Time Limit Reached status triggered by QID 150024 - Scan Time Limit Reached. To improve scan times in those situations, or simply to gain efficiency, various settings in both the option profile and the web application configuration can be used to improve performance. This guide highlights those settings and the Scan Diagnostics QIDs that help guide configuration changes.
Scanning Option Profile Settings to Improve Scan Time
Form Crawl Scope - You have the option to include the form action URI in the form uniqueness calculation. If the same form appears on multiple pages (e.g. a search bar), the scan engine by default identifies it as a single form and tests it only once. Once the form action URI is included in the uniqueness calculation, however, the scanner will test that same form on every page where it exists. This is unnecessary in most cases and can add significant time to a scan, so it is suggested not to include the form action URI.
Maximum links to test in scope - Set to the maximum of 8000 links.
SmartScan Depth - This setting can significantly affect scan times. With dynamic applications, the higher the SmartScan Depth, the more distinct AJAX/SPA endpoints the scanner will attempt to generate. Customers often set this to a high number thinking it will improve their scan, without really understanding the feature or their application. The default is 5, but it is generally recommended to set this to 2; once every other aspect of the scan is under control, you can certainly increase the SmartScan Depth as necessary.
As dynamic applications are highly dependent on more advanced frameworks, most vulnerabilities will be identified through the libraries in use and reported separately. SmartScan allows us to interact with the site and identify additional URLs and targets for testing, so a low SmartScan Depth should not adversely impact the thoroughness of the testing methodology.
Timeout Error Threshold - The default setting is 100 and keeping this enabled is recommended. If many timeout errors occur while scanning an application, this will adversely affect the total scan time. It is better to have the scan end in a timeout error to address correcting the problem with the developers and server admins instead of artificially inflating the error threshold or disabling it completely.
Unexpected Error Threshold - The default setting is 300 and keeping this enabled is recommended. If many unexpected errors occur while scanning an application, this will adversely affect the total scan time. It is better to have the scan end in a service error to address correcting the problem with the developers and server admins instead of artificially inflating the error threshold or disabling it completely.
Scan Intensity - At the low end, the scanning engine uses 1 HTTP thread to interact with the application, with a 2-second delay between each server response and the next request. At the maximum intensity setting there are 10 HTTP threads interacting with the application. The goal is to make scans completely safe for production environments, and as such they are slow by design. If you notice response times increasing throughout the scan, or if multiple timeouts are observed, the scan intensity may need to be lowered.
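To see why intensity matters so much, the settings above can be turned into a back-of-envelope estimate. This is a minimal sketch only: the serial-per-thread model and the 0.5-second average response time are illustrative assumptions, not how the scan engine actually schedules requests.

```python
# Rough wall-clock estimate for a scan at different intensities.
# Low intensity = 1 HTTP thread with a 2-second post-response delay;
# maximum intensity = 10 concurrent HTTP threads with no delay.
# avg_response_s is a hypothetical average server response time.

def estimated_hours(total_requests, threads, delay_s, avg_response_s):
    """Simple model: each request costs (response time + delay),
    divided across the available threads."""
    per_request = avg_response_s + delay_s
    return total_requests * per_request / threads / 3600

# 250,000 requests (a batch size of the kind seen in scan diagnostics)
# at an assumed 0.5-second average response time:
low = estimated_hours(250_000, threads=1, delay_s=2.0, avg_response_s=0.5)
high = estimated_hours(250_000, threads=10, delay_s=0.0, avg_response_s=0.5)
print(f"low intensity:  ~{low:.0f} hours")   # ~174 hours
print(f"high intensity: ~{high:.1f} hours")  # ~3.5 hours
```

Even under this crude model, the spread between intensities is enormous, which is why slow server response times compound so badly at low intensity.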
Bruteforcing Settings - Disable this feature as it is time intensive and not practical for most instances.
Web Application Configuration Settings to Improve Scan Time
Scan Settings - Duration - Set the cancel option to "Do not Cancel Scan" if possible to give the scan the most time to completely test the application. Please note - all scans will end after 24 hours regardless of whether you set this to "Do not Cancel Scan" or not.
Redundant Links - This feature can be useful to omit testing of identical source code on pages matching regular expressions. Be sure to use PCRE formatting rules for any regular expressions, and test them before applying them to a scan.
A few useful things to keep in mind: if you are using complete or partial URLs including the "http://" prefix, you can match both http and https by using https? (the ? makes the "s" optional). In many instances this can be avoided entirely by writing a regular expression that matches on something unique in the URLs themselves. For instance, if there is a collection of press releases and you have identified them as good candidates for redundant links, you can use wildcards instead of specifying complete URLs.
As an example, consider http://www.example.com/press/releases/2019/December/... To include both the http and https version of ALL press releases regardless of year or month of release, a valid regular expression would be: .*\/press\/releases.* (note the escape character '\' in front of the forward slashes '/').
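The press-release pattern above can be sanity-checked before it goes into the option profile. This sketch uses Python's re module as a stand-in for the PCRE engine WAS uses; the constructs in this pattern behave the same in both, and the URLs are hypothetical.

```python
import re

# The redundant-link pattern from the example above.
pattern = re.compile(r".*\/press\/releases.*")

urls = [
    "http://www.example.com/press/releases/2019/December/launch.html",
    "https://www.example.com/press/releases/2020/January/update.html",
    "https://www.example.com/products/index.html",
]

for url in urls:
    # Matching URLs would be treated as redundant; others are crawled.
    print(url, "->", "redundant" if pattern.match(url) else "crawled")
```

Both press-release URLs match regardless of scheme, year, or month, while the products page does not, which is exactly the behavior intended for the rule.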
Sometimes customers will complain that even after adding a redundant link, WAS is still crawling and testing URLs that should be omitted. This usually happens as a result of running a vulnerability scan before setting up valid redundant link regular expressions. If a vulnerability scan identifies any vulnerability on any URL, the scanner will ALWAYS attempt to retest them regardless of any redundant link. The only way to remove this prior vulnerability information is to purge the application, and then recreate it with the correct redundant link rules in place. Therefore, it is important to run discovery scans and ensure you are crawling only the links you desire before launching a vulnerability scan.
Also, changes cannot be applied to ongoing progressive scans. They will continue with the previous configuration. If changes are made, the progression needs to be restarted. Please note that any explicit URLs to crawl under the application details section of the web application configuration will also supersede the redundant link rules.
Exclusions - Global Settings - By default this is enabled and could conflict with application configuration exclusions for white lists or black lists. This would essentially break whatever rules you are trying to set in place. If you are setting application level white lists or black lists, be sure to disable the global settings or at least review them to ensure they are not contradictory to the application settings. A common global exclusion could be \.(jpg|img|gif|png|ico|gz|svg|woff|woff2|ttf|jpeg|tif|mp4).*$. This would remove all links with those extensions from the links crawled.
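Before relying on a global exclusion like the one above, it is worth checking it against sample links. This sketch again uses Python's re module as a stand-in for PCRE, with hypothetical URLs; re.search is used because the pattern is anchored only at the end.

```python
import re

# The example global-exclusion pattern for static-asset extensions.
pattern = re.compile(
    r"\.(jpg|img|gif|png|ico|gz|svg|woff|woff2|ttf|jpeg|tif|mp4).*$"
)

links = [
    "https://www.example.com/assets/logo.png",
    "https://www.example.com/fonts/site.woff2?v=3",   # query string after the extension
    "https://www.example.com/reports/summary.aspx",   # should still be crawled
]

for link in links:
    print(link, "->", "excluded" if pattern.search(link) else "crawled")
```

Note that the trailing .*$ is what lets the pattern catch assets with query strings or cache-busting suffixes after the extension.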
Exclusions - White List – The scanner will only crawl the links explicitly provided as URLs or those that match the regular expressions you provide. This is not the same as providing explicit URLs to crawl under the application details section. The scanner will not just "jump" to the whitelisted URLs to crawl – it must follow links to reach them. If any linkage in that chain is not included in the white list rules, the crawl may not reach all the links you expect and sections of the application will not be reached. This can be corrected by providing explicit URLs to crawl under the application details section as these URLs supersede ANY redundant link or exclusion rules you set up. Please also note that setting anything under white listing will essentially black list all other links provided you have not added them explicitly as mentioned previously.
Exclusions - Black List - We will only skip the links explicitly provided as URLs or those that match the regular expressions you provide. However, any URLs added to the application details section as explicit links to crawl will supersede ANY redundant link or exclusion rules you set up. Please also note that setting anything under black listing will essentially white list all other links. For this reason, it is important not to set up rules under both black listing and white listing as they may create a conflict with each other and result in erratic behavior.
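The white-list/black-list conflict described above is easy to demonstrate with two hypothetical rules: a single URL can match both lists at once, leaving the scanner with contradictory instructions. The patterns and URL here are purely illustrative.

```python
import re

# Hypothetical, conflicting rules:
white_list = re.compile(r".*\/store\/.*")      # only crawl the store section
black_list = re.compile(r".*\/checkout\/.*")   # never crawl checkout pages

url = "https://www.example.com/store/checkout/payment"
print("matches white list:", bool(white_list.search(url)))
print("matches black list:", bool(black_list.search(url)))
# Both print True: the URL is simultaneously "must crawl" and
# "must skip", which is the kind of conflict to avoid.
```

This is why the guidance above is to use one mechanism or the other, not both.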
Exclusions - Parameters - You can add exclusions for very specific items to omit from the web application scan. The most common and potentially useful use case here for decreasing scan time is to apply an exclusion to all non-session related cookies. Some sites may issue dozens or hundreds of cookies that the scanner will test otherwise. Also, all cookies are tested against all links so this can quickly create overly long scan times.
For more details on Exclusions, please see: https://discussions.qualys.com/docs/DOC-7075-web-application-scanning-controlling-links-crawled-with-explicit-urls-redundant-links-black-lists-and-white-lists
Scan Report QIDs to Review to Improve Scan Time
Review your scan report for the following QIDs to help identify where scans are consuming the most time.
150009 - Links Crawled - Reviewing this list should help identify potential candidates for the redundant link feature and blacklists discussed above.
150018 - Connection Error Occurred During Web Application Scan - Work with server admins to figure out why connection errors are occurring to decrease overall scan time.
150021 - Scan Diagnostics - A lot of information is found here. Examples include the average server response time:
Total requests made: 8432
Average server response time: 0.01 seconds
The goal is for a response time under 1 second. Long response times greatly increase the time to complete the scan. If the scan is being run at a high intensity, this could be related and lowering the scan intensity may be required. An average response time of 3 seconds or greater will result in a long scan time without question. It is advised to work with the application and server teams to improve response times.
Batch testing information - how long did each batch of tests take? The diagnostics may show hours spent on forms or cookie manipulation.
Batch #4 Cookie manipulation: estimated time < 212 hours (39 tests, 8 inputs)
Batch #4 Cookie manipulation: 39 vulnsigs tests, completed 250000 requests, 79211 seconds.
Completed 250000 requests of 2378792 estimated requests (10.5095%). Module did not finish.
Once you see something unusual you can take steps to improve it. In most cases, this QID quickly reveals where the vast majority of time testing is spent. In this example, using the parameter exclusion option to reduce the number of cookies will improve scan times.
150026 - Maximum Number of Links Reached During Crawl - Ensure you have set max links to crawl to 8000. We may still reach the maximum, but at least this way with progressive scanning you will more efficiently crawl and test all the links in the application.
150028 - Cookies Collected - Is the scan detecting a lot of cookies? How long is the scan spending testing these cookies per the scan diagnostics? Use parameter exclusions if necessary.
150097 - HTTP Response Indicates Scan May Be Blocked - If requests are being blocked, this will impact the efficiency of scans and increase the scan time. Ensure you are whitelisting Qualys in any WAF, IDS, IPS, firewalls, etc.
150100 - Selenium Diagnostics - If Selenium authentication fails, review this QID to determine which step is failing. Utilize the QBR User Guide to ensure you are following best practices.
150140 - Redundant links/URL paths crawled and not crawled - If you provide any rules under the redundant link section, here is where you can review and validate them. For each rule, the scan report should show how many were matched and crawled as well as how many were matched and omitted from testing. Remember any prior vulnerability found at one of these URLs will supersede the rules you specify under the redundant links section.
150148 - AJAX Links Crawled - This is completely controlled by the SmartScan Depth you selected, as discussed above. The greater the depth, the more dynamic links are created, even if they do not alter the endpoint. Since these AJAX links are Selenium scripts, each one can be executed by copying and pasting it into the QBR recorder to see what is happening. In this way you can validate the AJAX links created, but this is a manual process. It is recommended to start with a depth of 2 and move it up if the scan time is within reason.
150152 - Forms Crawled - We will report on duplicate forms crawled and tested and make a recommendation on disabling the form action URI in calculating uniqueness, as discussed above. This is a great way to see just how many duplicates you may be unnecessarily testing. It also helps identify forms that may create issues with email submissions.
While this list is not complete, and every environment is different, these recommendations should help you improve scan efficiency and decrease total scan time. There are, of course, some situations or applications where no amount of configuration changes can shrink a 24-hour scan to fit a small 4-hour test window. In those cases, progressive scanning is important to ensure complete application coverage and testing. Following any or all of the above recommendations should have a positive impact on performance.
A Note about Progressive Scanning:
In the event you have short scan windows, greater than 8000 links, or a scan takes more than 24 hours to complete, progressive scanning will enable the continuation of a scan. While this does not decrease the overall scan time, it will ensure the application is completely tested over a series of progressive scans, if necessary. The vast majority of applications do not have 8000 valid unique links and can be tuned to fit inside the link limit.