The dataset "RC_Data" was created by Lori Flynn, Aubrie Woods, Ebonie McNeil, and Matt Sisk.
It can be downloaded here: RC_Data.zip (publication date: April 2, 2020)
This data was generated as part of a series of research projects led by Lori Flynn into automated classification of static analysis alerts (warnings) and meta-alerts (alerts mapped to code flaws, a.k.a. conditions).
We are publishing this data to enable others to test algorithms and tools we developed, and also to support external research on automated classification.
The RC_Data file can be downloaded and then restored into a MongoDB database. The database contains data for two test suites: the Juliet Java Test Suite and the Juliet C/C++ Test Suite. The Juliet test suites are open-source (created by NSA CAS, hosted on the NIST SARD website) and were created for testing the quality of static analysis flaw-finding tools. We use them differently from their original design: to help generate data for creating and testing automated classification tools. The RC_Data dataset includes structured data about flaw-finding static analysis alerts from open-source tools, information about the conditions (CWEs) those alerts are mapped to, verdicts (true/false/unknown) determined using test suite metadata, and code metrics from open-source code metrics tools.
In the future, we will host augmented versions of this dataset here. They will include open-source data from more codebases (both test suites and other code), data from more tools, and more features.
How to use the downloaded file:
Unzip the .zip file using your favorite tool. That leaves you with the license file license.txt and the database file ophelia_merge_oss_c_java.gz.
To restore to a mongo database, the following instructions work in a bash terminal in Linux:
1. Extract the compressed file: gunzip ophelia_merge_oss_c_java.gz
2. Restore the extracted archive: mongorestore --host localhost:27017 --archive=<YOUR_FILEPATH_HERE>/ophelia_merge_oss_c_java
In the above command, replace <YOUR_FILEPATH_HERE> with the path to the directory that contains the extracted file on your own machine.
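The restore steps above can be wrapped in a small shell function. This is a minimal sketch and not part of the dataset: the function name restore_rc_data and its arguments are hypothetical, and it assumes the archive has already been extracted with gunzip. It checks that the file exists and that mongorestore is installed before running the same mongorestore command shown above.

```shell
# Hypothetical helper (not shipped with RC_Data): restore the extracted archive.
#   $1: path to the extracted archive file (ophelia_merge_oss_c_java)
#   $2: optional host:port of the MongoDB server (default: localhost:27017)
restore_rc_data() {
  local archive="$1"
  local host="${2:-localhost:27017}"

  # Fail early if the archive path is wrong.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  # Fail early if the MongoDB database tools are not installed.
  if ! command -v mongorestore >/dev/null 2>&1; then
    echo "mongorestore is not on PATH" >&2
    return 2
  fi

  # Same command as in the restore step above.
  mongorestore --host "$host" --archive="$archive"
}

# Usage (path is a placeholder, replace it with your own):
# restore_rc_data <YOUR_FILEPATH_HERE>/ophelia_merge_oss_c_java
```

Returning distinct exit codes for a missing file (1) and a missing tool (2) makes it easier to tell which precondition failed when the function is called from another script.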
Then, you can inspect the database (e.g., using the mongo shell from a bash terminal).
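One way to begin inspecting the restored data is to list the databases on the server. The sketch below is an illustration under stated assumptions, not part of the dataset: the helper names pick_mongo_shell and list_databases are hypothetical, and a MongoDB server is assumed to be running on localhost port 27017. It prefers mongosh and falls back to the older mongo shell, since newer MongoDB releases replace mongo with mongosh.

```shell
# Hypothetical helper: print the name of an installed MongoDB shell.
pick_mongo_shell() {
  local candidate
  for candidate in mongosh mongo; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  echo "no MongoDB shell found on PATH" >&2
  return 1
}

# Hypothetical helper: print the databases known to the local server.
# Assumes the server listens on localhost:27017, as in the restore command.
list_databases() {
  local shell_bin
  shell_bin="$(pick_mongo_shell)" || return 1
  "$shell_bin" --quiet --host localhost --port 27017 \
    --eval 'printjson(db.adminCommand({ listDatabases: 1 }))'
}

# Usage (requires a running server):
# list_databases
```

The output should include the database created by the restore. Its collections can then be explored interactively from the same shell.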