The dataset "RC_Data" was created by Lori Flynn, Aubrie Woods, Ebonie McNeil, and Matt Sisk.

It can be downloaded here: RC_Data

This data was generated as part of a series of research projects led by Lori Flynn into automated classification of static analysis alerts (warnings) and meta-alerts (alerts mapped to code flaws, a.k.a. conditions).

We are publishing this data to enable others to test algorithms and tools we developed, and also to support other research on automated classification.

The RC_Data file can be downloaded and then reconstituted into a MongoDB database. The database contains data for two test suites: the Juliet Java Test Suite and the Juliet C/C++ Test Suite. Both test suites are open-source (created by NSA CAS and hosted on the NIST SARD website). The RC_Data dataset includes structured data about flaw-finding static analysis alerts from open-source tools, information about the conditions (CWEs) those alerts are mapped to, verdicts (true/false/unknown) determined using test suite metadata, and code metrics from open-source code metrics tools.
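As a rough illustration of how the verdict data described above might be summarized once loaded, the following minimal sketch tallies verdicts per condition. The field names `condition` and `verdict` and the sample records are assumptions made for this example only; they are not taken from the actual RC_Data schema, which should be checked after the database is restored.

```python
from collections import Counter, defaultdict


def tally_verdicts(meta_alerts):
    """Count verdicts per condition for an iterable of dict-like records.

    The keys "condition" and "verdict" are hypothetical field names;
    the real RC_Data documents may use different ones.
    """
    counts = defaultdict(Counter)
    for doc in meta_alerts:
        # Fall back to placeholder values when a field is missing.
        condition = doc.get("condition", "unknown-condition")
        verdict = doc.get("verdict", "unknown")
        counts[condition][verdict] += 1
    # Convert to plain dicts so the result is easy to print or compare.
    return {cond: dict(c) for cond, c in counts.items()}


# Made-up sample records for illustration only (not taken from RC_Data).
sample = [
    {"condition": "CWE-190", "verdict": "true"},
    {"condition": "CWE-190", "verdict": "false"},
    {"condition": "CWE-476", "verdict": "unknown"},
    {"condition": "CWE-190", "verdict": "true"},
]
result = tally_verdicts(sample)
print(result)
```

The function accepts any iterable of dict-like records, so a cursor returned by a MongoDB driver query (for example, pymongo's `find()`) could be passed in directly, using whatever collection and field names the restored database actually contains.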

In the future, we will add more data to augmented versions of this dataset, which will be hosted here. The augmented versions will include open-source data from more codebases (both test suites and other code), data from more tools, and additional features.
