Some people have been asking when they'll get to see all the great output and data we're generating, so this seemed like a good time to explain where we are right now. While we've been busy executing on this plan, we've realized that other people might like to know more about what's happening behind the scenes. To make automatically generated reports and software scores available at scale, here's what needs to happen:
1. Automate Static Analysis measurement collection - This part is done for Windows, OSX, and Linux/ELF environments (Intel and ARM). It takes ~1 second per binary and we're confident in the accuracy of the results we're collecting.
2. Collect a lot of fuzzing and crash test data - Well underway, but still ongoing. We've got about 100 cores chugging away, and enough results now to be moving on to the next step.
3. Correlate dynamic analysis results (step 2) with static analysis results (step 1) to finalize score calculations - This is what we're working on now, and it's the main thing that has to happen before we're comfortable releasing reports at scale.
4. (Reach Goal) Gain enough confidence in our mathematical model to successfully predict dynamic results based on static results. This will allow us to present estimated crash test results based on easily automated analysis.
Why is this third step so important? We know that some things make software safer (ASLR, stack guards, DEP, source fortification, etc.) and that some things make software weaker (using historically unsafe functions, high complexity, etc.), but the industry needs better data on how much each of these factors actually affects software safety. If a perfect score is 100, how many points is having ASLR worth? Linking to insecure libraries certainly introduces risk, but how much should it impact the overall score? We want to have a better answer to questions like these before we publish our first official software safety reports. Having a strong model to support our risk assessments will give our ratings the credibility they need in order to influence consumers, developers, the security community, and the commercial world.
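To make the "how many points is ASLR worth?" question concrete, here is a minimal Python sketch of one naive way to turn crash data into point weights. It is not our actual model: the feature names, the sample records, and the crash rates are all invented for illustration, and a simple difference of means like this ignores the fact that safety features tend to appear together. It only shows the shape of the problem, which is mapping static measurements onto observed dynamic behavior.

```python
# Hypothetical toy data, invented for illustration only.
# Each record: (static features of a binary, crashes per 1,000 fuzzing runs).
SAMPLES = [
    ({"aslr": 1, "stack_guard": 1, "dep": 1}, 2.0),
    ({"aslr": 1, "stack_guard": 0, "dep": 1}, 5.0),
    ({"aslr": 0, "stack_guard": 1, "dep": 1}, 6.0),
    ({"aslr": 0, "stack_guard": 0, "dep": 0}, 12.0),
    ({"aslr": 1, "stack_guard": 1, "dep": 0}, 4.0),
    ({"aslr": 0, "stack_guard": 0, "dep": 1}, 9.0),
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def crash_reduction(samples, feature):
    """Mean crash rate without the feature minus mean crash rate with it.

    Returns 0.0 when the feature is present in all samples or in none,
    since there is nothing to compare in that case.
    """
    with_feature = [rate for feats, rate in samples if feats[feature]]
    without_feature = [rate for feats, rate in samples if not feats[feature]]
    if not with_feature or not without_feature:
        return 0.0
    return mean(without_feature) - mean(with_feature)


def points_per_feature(samples, features, total_points=100):
    """Split total_points across features in proportion to crash reduction.

    Features that show no reduction (or an increase) get zero points.
    """
    reductions = {f: max(crash_reduction(samples, f), 0.0) for f in features}
    total = sum(reductions.values())
    if total == 0:
        return {f: 0.0 for f in features}
    return {f: total_points * r / total for f, r in reductions.items()}


if __name__ == "__main__":
    weights = points_per_feature(SAMPLES, ["aslr", "stack_guard", "dep"])
    for name, points in weights.items():
        print(f"{name}: {points:.1f} points")
```

A real model would need far more samples and a method that accounts for correlated features, such as a regression over all the static measurements at once, which is exactly why the correlation step takes time to get right.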
In the meantime, here's an overview of the sorts of software properties we're measuring with our static analysis. Also, stay tuned! We're hoping to have some exciting new partnerships and efforts to announce in the coming months.