Qualys API Best Practices: Host List Detection API
Host List Detection is your subscription’s list of hosts and their corresponding up-to-date detections including:
- Confirmed Vulnerability Detections
- Potential Vulnerability Detections
- Information Gathered Detections about your system
After extracting Host List Detection vulnerability data from Qualys, you’ll be able to create custom reporting, perform ad-hoc vulnerability analysis or distribute the vulnerability state of your systems to a central data store. To help achieve this, we are bringing together KnowledgeBase API and Host List API to demonstrate how they work together with Host List Detection API. These three Vulnerability Management (VM) APIs are brought together to provide a rich set of vulnerability information, including:
- Host List asset information about when each of your assets were last scanned
- KnowledgeBase information that includes detailed vulnerability knowledge, Ex. CVSS (Common Vulnerability Scoring System) scores
- Host List Detection information including specific vulnerability detection details that will help you better understand the vulnerabilities on your system, and the solutions to remediate them
In Part 3 of this series our goal is to combine the data from Host List, KnowledgeBase, and Host List Detection into the latest, timestamped, point-in-time SQLite database. To help customers realize this goal, we are providing a blueprint of example code called QualysETL that is open-sourced for your organization to develop with. In the diagram below, QualysETL is depicted as a workflow from which you can use the resulting SQLite database for analysis on your desktop, or as part of a continuous live data feed to update your corporate data store in the cloud or your local data center. The accompanying video presents QualysETL in more detail, along with live examples to help you effectively Extract, Transform, Load and Distribute Qualys Data.
The Qualys API is a key component in the API-First model. From the beginning of Qualys in 1999, a rich set of Qualys APIs have been available and continue to improve. As a result, customers have been able to automate processing Qualys in new ways, increasing their return on investment (ROI), and improving overall mean time to remediate (MTTR) vulnerabilities throughout the enterprise.
Even with all these advances in API, some customers continue to experience suboptimal performance in various areas such as automation. So, what are the inherent automation challenges to ETL or Extract, Transform and Load your Qualys Data? And what do we mean by ETL?
- Extract refers to extracting Qualys Vulnerability Data using Qualys APIs.
- Transform refers to reading the resulting extracted vulnerability data from Qualys and transforming or enhancing it into other forms/formats that your organization decides will be useful, for example CSV (Comma Separated Value) or JSON. The transform step is also an opportunity to enhance the data, for example injecting security intelligence specific to your organization that will help drive remediation. Ex. Application Ownership Information, Infrastructure Patching Team Name.
- Load refers to loading the data into its final form on disk for independent analysis ( Ex. SQLite ) or distributing Qualys data to its destination in the cloud.
To help customers with ETL, we are providing a reusable blueprint of live example code called QualysETL. You can reuse and customize QualysETL example code to suit your organization’s needs. In the accompanying video presentation, we will demonstrate installation and operation of the QualysETL software within a Python Virtual Environment on an Ubuntu 20.04 VM.
At the end of this Qualys Host List Detection API blog post and video, you will gain experience in the areas of development, design, and performance with the Qualys API including:
- Applying a simple ETL design pattern to the Host List Detection API.
- Installation and Operation of QualysETL.
- QualysETL transformation of Host List Detection XML into Python Shelve Dictionary, JSON, CSV and SQLite Database.
In the next part of this series, we’ll add CyberSecurity Asset Management API (formerly known as Global IT Asset Inventory) so you can add a deeper asset inventory correlation of your systems with vulnerability data, including software inventory, end of life, cloud provider information, tagging and other metadata you’ll use to enhance the overall security view of your systems.
Definition of Host List Detection
Qualys Host List Detection: Your subscriptions list of hosts and corresponding up-to-date detections including 1) Confirmed Vulnerabilities, 2) Potential Vulnerabilities and 3) Information Gathered about your system.
- TYPE – Includes Confirmed Vuln, Potential Vuln, and Info Gathered.
- QID (Qualys Identifier) – correlates between the KnowledgeBase and Host List Detection.
- Host ID – correlates between Host List and Host List Detection.
With any API, there are inherent automation challenges. Some of those automation challenges for Host List Detection are:
- What are the best practice programming methods to extract Host List Detections from the Qualys API reliably, efficiently?
- How to obtain all the Host List Detection XML output which provides detailed detection reporting of Confirmed, Potential and Information Gathered Detections.
- How to integrate Qualys data into a customer’s database for reuse in automation.
You will want to transform XML data into a format suitable for storage or future correlations with other corporate data sources. The QualysETL blueprint of example code can help you with that objective.
ETL Design Pattern
The ETL Design Pattern or Extract, Transform and Load design pattern is a wonderful place to start when transforming Qualys API data into a form/format that is appropriate for your organization. In the diagram, you see depicted the generalized ETL cycle for, the KnowledgeBase which includes rich details related to each vulnerability, the Host List, which is the programmatic driver using Host IDs and VM_Processed_After Date to ETL Host List Detection.
These ETLs are encapsulated in the example blueprint code QualysETL. QualysETL is a fantastic way to get started with your extract, transform and load objectives.
Host List Detection Activity Diagram
The Host List Detection Activity Diagrams key point is to depict the three types of ETLs, operating simultaneously, resulting in an ETL of all three types of data, Host List, KnowledgeBase, and Host List Detection.
In the diagram you see the ETL of Knowledgebase, operating simultaneously next to the ETL of Host List, which is the programmatic driver for, the ETL of Host List Detection. Near the center of the Activity Diagram, you can see the prepare HostID queue. This is the list of HostIDs that drive the downloading of Host List Detection via spawning of concurrently running jobs through a multiprocessing facility. The result will be CSV, JSON and SQLite which includes the relevant KnowledgeBase, Host List and Host List Detection tables. These data are being stored in both their independent data locations as well as combined into one SQLite database instance that can be used as the most recent view of your vulnerability data. You can distribute snapshots of your ETL data for desktop analysis or as a pipeline of continues updates in your corporate data store.
- Use Host List ETL to drive Host List Detection Extract, scoping the extract to brief time intervals via vm_processed_after date.
- Include incremental KnowledgeBase after Host List Detection Extract is completed.
- Distribute snapshots of your ETL data for desktop analysis or as a pipeline of continues updates in your organization’s data store.
What is QualysETL?
QualysETL is a blueprint of example code written in python that can be used by your organization as a starting point to develop your companies ETL automation. Below you see the QualysETL Workflow which includes:
- Extract – The API Calls to Qualys.
- Transform – The methods which transform, enhance, cleanse, and catch errors in data.
- Load – The methods which produce various forms of data such as JSON, CSV and SQLite3.
- Distribution – The methods that can be devised to make data usable by your organization.
One example of distribution would be for your organization to develop a method of uploading a timestamped version of SQLite into an AWS (Amazon Web Services) Relational Database Service or distribute to an AWS S3 Bucket. Another example of distribution would be to ensure the SQLite database is available via a local share on your network where analysts can process and report on vulnerabilities in your organization using their desktop tool of choice.
To install QualysETL, we recommend you spin up a secure virtual machine instance of Ubuntu 20.04 that has connectivity to the internet. When that step is completed, you can login to your Ubuntu instance and work along with me in the accompanying video to install the application and run your first ETL. The instructions are located on Pypi.org at:
Once you have worked along with me in the accompanying video, you can run your own SQL queries to analyze the data and tune the application to meet your needs.
QualysETL Key Takeaways
- With one command, you can ETL Host List Detection into a current SQLite Database, ready for analysis or distribution.
- QualysETL is blueprint example code you can extend or use as you need. It is open source, distributed under the Apache 2 license.
Put it into Practice!
Going forward, here are some final key tips:
- Leverage QualysETL as a blueprint of example code to produce a current Host List Detection SQLite Database, ready for analysis or distribution.
- For questions, schedule time through your TAM (Technical Account Manager) to meet with our solutions architects, we are here to help.
About This Series
The Qualys API Best Practices Technical Series is designed for stakeholders or programmers with general knowledge of programming who want to implement best practices to improve development, design, and performance of their programs that use the Qualys API.
The API Best Practices Series will expand over the coming months to cover other key aspects of the Qualys API, with each presentation building on the previous one and in aggregate providing an overall best practice view of the Qualys API. The next presentations in the series will focus on CyberSecurity Asset Management (CSAM) API formerly known as Global IT Asset Inventory API.
This post was first first published on Qualys Security Blog’ website by David Gregory. You can view it by clicking here