Test Metric Report

Last modified by Vincent Massol on 2018/05/11 17:38

Context: This report provides information related to testing in the XWiki project for the STAMPS research project, in which XWiki SAS is participating. However, since this information can be useful to everyone, the data is published openly on this page.

Static Metrics

  • Sources:
    • sonar.xwiki.org
    • For #Test Classes: find . -name "*Test.java" | wc -l and find . -name "*Tests.java" | wc -l in each git repo
    • For #Parametrized Tests: the difference between the total number of tests executed and the number of test methods. If it is > 0, there are parametrized tests. To get the total number of tests executed, we use sonar. For example, for Commons: http://ci.xwiki.org/job/xwiki-commons/lastSuccessfulBuild/testReport/api/xml, and then read totalCount.
    • For #UI Test classes, we use find . -name "*Test.java" -path "*-test/*-tests/*" | wc -l in platform. For Enterprise we just count all files ending with "Test(s).java" since they're all UI tests.
    • Right now XWiki Enterprise is not properly configured on sonar.xwiki.org
    • For Enterprise # Test Methods, we have that info in Clover reports. Date of collection: 2017-02-22
  • Date of data collection: 2016-12-02 (for NCLOC + #Classes + #Test Methods), 2017-01-06 for the rest and for "Enterprise"
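
The find commands listed in the sources can be mirrored in a script when the counts need to be collected for several checkouts. This is a minimal sketch, assuming a local clone of a repository. The function names are illustrative and not part of any XWiki tooling, and restricting the matches to regular files is an added assumption (find -name would also match directories).

```python
from fnmatch import fnmatchcase
from pathlib import Path


def count_test_classes(repo_root):
    """Count files named *Test.java or *Tests.java below repo_root."""
    root = Path(repo_root)
    return sum(
        1
        for p in root.rglob("*.java")
        if p.is_file() and p.name.endswith(("Test.java", "Tests.java"))
    )


def count_ui_test_classes(repo_root):
    """Count *Test.java files whose path matches *-test/*-tests/*."""
    root = Path(repo_root)
    count = 0
    for p in root.rglob("*Test.java"):
        # Rebuild the "./relative/path" form that find matches -path against.
        rel = "./" + p.relative_to(root).as_posix()
        if p.is_file() and fnmatchcase(rel, "*-test/*-tests/*"):
            count += 1
    return count
```

Calling count_test_classes(".") from the root of a clone gives the sum of the two find counts in one number.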
| Repository | NCLOC for Java code | # Classes | # Test Classes | # Test Methods | # Parametrized Tests | # UI Test Classes |
|---|---|---|---|---|---|---|
| Commons | 54429 | 1097 | 210 | 1057 | 1078 tests vs 1057 test methods | 0 |
| Rendering | 37490 | 623 | 119 | 299 | 1610 tests vs 299 test methods | 0 |
| Platform | 236808 | 3531 | 723 | 3382 | 3804 tests vs 3382 test methods | 69 |
| Enterprise | ? | ? | 99 | 4288 | ? | 99 |
| TOTAL | 328727 | 5251 | 1151 | 4738 | 6492 tests vs 4738 test methods | 168 |
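
The "tests vs test methods" comparison in the table is a plain subtraction, which can be sketched as follows. The executed/method pairs are copied from the table. The sample XML is a hypothetical, trimmed-down stand-in for a test report containing a totalCount element, not a real response from ci.xwiki.org, and the function names are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened report; only the totalCount element matters here.
SAMPLE_REPORT = "<report><failCount>0</failCount><totalCount>1078</totalCount></report>"

# (tests executed, test methods) per repository, copied from the table.
TABLE = {
    "Commons": (1078, 1057),
    "Rendering": (1610, 299),
    "Platform": (3804, 3382),
}


def read_total_count(report_xml):
    """Return the integer held by the first totalCount element of a report."""
    root = ET.fromstring(report_xml)
    node = root.find(".//totalCount")
    if node is None or node.text is None:
        raise ValueError("no totalCount element in report")
    return int(node.text)


def extra_executions(total_executed, test_methods):
    """Executions beyond one per test method; > 0 suggests parametrized tests."""
    return total_executed - test_methods


for repo, (executed, methods) in TABLE.items():
    print(repo, extra_executions(executed, methods))
# Commons 21
# Rendering 1311
# Platform 422
```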

Dynamic Metrics

Statement Coverage

Note: Definition of TPC

Flickering Tests

  • Source: https://jira.xwiki.org with the JQL: labels = flickering and resolution = Unresolved and category = 10000
    • This includes Commons, Rendering, Platform and Enterprise
  • Date of data collection: 2017-02-22
  • Value: 12
  • List of JIRA issues:
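
To re-collect the value at a later date, the JQL above could be sent to the JIRA search endpoint. The sketch below only builds the URL and performs no network call. It assumes that jira.xwiki.org exposes the standard JIRA REST API (version 2) search resource; that has not been verified for this instance, and the function name is illustrative.

```python
from urllib.parse import urlencode

JIRA_BASE = "https://jira.xwiki.org"
FLICKERING_JQL = (
    "labels = flickering and resolution = Unresolved and category = 10000"
)


def search_url(jql, max_results=0):
    """Build a search URL for the given JQL (assumed REST API v2 endpoint)."""
    query = urlencode({"jql": jql, "maxResults": max_results})
    return f"{JIRA_BASE}/rest/api/2/search?{query}"


print(search_url(FLICKERING_JQL))
```

With maxResults set to 0, the standard search resource returns no issue bodies, only metadata such as the total number of matches, which is all this metric needs.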

Test Execution Times

  • Source: The Clover Report mentions the test execution times
  • Date of data collection: 2017-02-22

Note: Some tests reported as failing don't really fail; this is a limitation of how Clover interprets the test results.

Deployment Time for Manual Testing

Metric: The amount of time needed to set up the software for manual testing. This does not include the time required to compile the source code to binary.

  • Source: time to install xwiki manually
  • Date of collection: 2017-01-01
  • Time to download the XWiki WAR + time to download a DB (MySQL, PostgreSQL) + time to download a servlet container (Tomcat, etc) + time to set it up = 2 hours
  • Date of collection: 2017-02-22
  • Install Docker once and use the XWiki Docker image (for MySQL + Tomcat) = 5 minutes

System-specific bugs

  • Results for 2016 (percentage of failing tests): 139 / 4967 * 100 ≈ 2.80%

Crash replicating test cases

  • Source (approximation): search for JIRA issues that contain a stack trace, then find those among them whose fix modified a test
    • JQL: description ~ '.java:' AND description ~ 'Exception' AND category = 10000 AND createdDate >= 2016-01-01 AND createdDate <= 2016-12-31 and resolution not in ("Cannot Reproduce", Duplicate, Inactive, Incomplete, Invalid, "Won't Fix") ORDER BY created DESC
  • Date of collection: 2016
  • Number of issues with stack traces (before removing the false positives): 25
  • Number of issues with stack traces (after removing the false positives): 17
  • Number of issues with stack traces and tests to prove the fix: 3
  • % test suite = 3 / (total test number) = 3 / 9962 = 0.03%
    • Total test number: 9962
  • Date of collection: 2017
  • Number of issues with stack traces (before removing the false positives): 23
  • Number of issues with stack traces and tests to prove the fix: 5 (XWIKI-14802, XWIKI-14766, XWIKI-14613, XWIKI-14556, XWIKI-14152)
  • % test suite = 5 / (total test number) = 5 / 10433 ≈ 0.05%
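
The "% test suite" figures are the number of issues that have a stack trace and a test proving the fix, divided by the total number of tests. A small worked check, using only the counts listed above (the function name is illustrative):

```python
def share_of_suite(issues_with_tests, total_tests):
    """Percentage of the test suite made up of crash-replicating tests."""
    return issues_with_tests / total_tests * 100


# 2016: 3 issues with tests, 9962 tests in total
print(f"{share_of_suite(3, 9962):.2f}%")    # 0.03%
# 2017: 5 issues with tests, 10433 tests in total
print(f"{share_of_suite(5, 10433):.2f}%")   # 0.05%
```

The 2017 value is about 0.048% before rounding to two decimals.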

What's below is not done yet!

Other Metrics

  • Number of covered public methods: Don't know how to filter this out, maybe could be done with Clover Code contexts
  • Execution time of test suites
  • Mutation score with Pitest

Building a Regression Benchmark

  • R1: (Warmup) Identify 10 Jira tickets about regression
  • R2: Identify more Jira tickets about regression
  • R3: (Warmup) Identify 10 commits which fix a regression
  • R4: Identify more commits which fix a regression
  • R5: (Warmup) Identify 10 commits which introduce a regression
  • R6: Identify more commits which introduce a regression

DSpot

  • Compute the percentage of false positives
  • Compute the effectiveness of each amplification operator
  • Set up extensible architecture to easily support new test case transformation operators
  • Propose new test case transformation operators
  • Contribute to high-quality user documentation
  • Package DSpot as a Maven plugin usable in CI
  • Package DSpot as a Gradle plugin usable in CI
  • Implement live parallel regression testing
  • Store generated tests and their history (define the API): commit version of original test, test method name, seed used, amplification operators used

DSpot + Use cases

  • Apply DSpot on the commits of R3 to see whether the regressions are caught
  • Apply DSpot on the commits of R4 to see whether the regressions are caught
  • Apply DSpot on the commits of R5 to see whether the regressions are caught
  • Apply DSpot on the commits of R6 to see whether the regressions are caught