The dataset "RC_Data" was created by Lori Flynn, Aubrie Woods, Ebonie McNeil, and Matt Sisk. 

It can be downloaded here: RC_Data_v3.zip (First publication date: April 2, 2020. Publication dates of later versions listed at bottom.)

This data was generated as part of a series of research projects led by Lori Flynn into automated classification of static analysis alerts (warnings) and meta-alerts (alerts mapped to code flaws, a.k.a. conditions).

...

The RC_Data file can be downloaded and then reconstituted into a mongo database. The database contains data for two test suites: the Juliet Java Test Suite and the Juliet C/C++ Test Suite. The Juliet test suites are open-source (created by NSA CAS, hosted on the NIST SARD website) and were created for testing the quality of static analysis flaw-finding tools. We use them differently from their original design: to help generate data for creating and testing automated classification tools. The RC_Data dataset includes structured data about the flaw-finding static analysis alerts from open-source tools, information about the conditions (CWEs) those alerts are mapped to, verdicts (true/false/unknown) determined using test suite meta-data, and code metrics from open-source code metrics tools.

...

How to use the downloaded file (in the example below, the version number is "v3"):

Unzip the .zip file using your favorite tool. That leaves you with the license file license.txt and the database file RC_Data_v3.gz

To restore to a mongo database, the following instructions work in a bash terminal in Linux:

1. Extract the compressed file: gunzip RC_Data_v3.gz

2. Restore the archive: mongorestore --host localhost:27017 --archive=<YOUR_FILEPATH_HERE>/RC_Data_v3

In the above command, replace <YOUR_FILEPATH_HERE> with the filepath on your own machine.
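The two steps above can be combined into a small bash function. This is a minimal sketch, assuming a MongoDB server is already listening on localhost:27017 and that gunzip and mongorestore are on your PATH. The function name restore_rc_data and the DRY_RUN variable are hypothetical helpers introduced here for illustration; they are not part of the dataset's tooling.

```shell
#!/usr/bin/env bash
# Sketch: decompress RC_Data_v3.gz and restore it into a MongoDB server.
# restore_rc_data and DRY_RUN are illustrative names (assumptions).

restore_rc_data() {
  local archive_gz="$1"                 # e.g. <YOUR_FILEPATH_HERE>/RC_Data_v3.gz
  local host="${2:-localhost:27017}"    # same default host as in the steps above
  local archive="${archive_gz%.gz}"     # file name that gunzip leaves behind

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the commands instead of running them.
    echo "gunzip $archive_gz"
    echo "mongorestore --host $host --archive=$archive"
    return 0
  fi

  gunzip "$archive_gz" || return 1
  mongorestore --host "$host" --archive="$archive"
}
```

Running it with DRY_RUN=1 first (for example, DRY_RUN=1 restore_rc_data ./RC_Data_v3.gz) prints the two commands so you can check the paths before anything is modified.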

Then, you can inspect the database (e.g., using the mongo application from a bash terminal).
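Because this page does not state the name of the restored database, a reasonable first step is to list the databases on the server. The following is a minimal sketch under that assumption. The function name list_databases and the MONGO_SHELL variable are hypothetical and introduced here for illustration. Newer MongoDB installations ship mongosh rather than the legacy mongo shell, so the shell binary is left configurable.

```shell
#!/usr/bin/env bash
# Sketch: list the databases on the server to locate the restored one.
# list_databases and MONGO_SHELL are illustrative names (assumptions).

list_databases() {
  local host="${1:-localhost:27017}"
  # Uses the legacy `mongo` shell unless MONGO_SHELL names another binary.
  "${MONGO_SHELL:-mongo}" --host "$host" --quiet \
    --eval 'db.getMongo().getDBNames().forEach(function (n) { print(n); })'
}
```

For example, MONGO_SHELL=mongosh list_databases runs the same query with the newer shell. Once you know the database name, you can open an interactive session and browse its collections.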


Publication dates, notes:

  • RC_Data (version 1): April 2, 2020. Note: first publication.
  • RC_Data_v2: April 27, 2020. Note: updated checker mappings.
  • RC_Data_v3: March 11, 2021. Note: updated README, per Schiela.