Collecting/combining files after parallel test runs

Collecting Test Results from Parallel Jobs in CircleCI

Overview

When running tests in parallel across multiple containers in CircleCI, each container generates its own test results and artifacts. However, many testing processes, reporting tools, and downstream integrations require all test results to be consolidated into a single file or directory for proper processing.

This guide demonstrates how to use a "collection" job pattern to gather test results from parallel containers using CircleCI workspaces, providing you with consolidated test data that can be processed, analyzed, or passed to subsequent workflow steps.

Common Use Cases

  • Code coverage reporting: Tools like Codecov or Coveralls often need combined coverage data
  • Test reporting dashboards: Consolidating results for comprehensive test analytics
  • Quality gates: Making pass/fail decisions based on complete test suite results
  • Artifact archival: Creating single downloadable test report packages
  • Integration testing: Passing complete test data to deployment or notification systems

The Collection Job Pattern

The collection job pattern involves creating a dedicated job that:

  1. Depends on parallel test jobs using the requires keyword
  2. Attaches the workspace, which holds the files persisted by every parallel container
  3. Combines or processes the collected test results
  4. Persists the consolidated results for downstream use

Basic Implementation

Step 1: Configure Parallel Test Jobs

First, set up your parallel test jobs to persist their results to workspaces:

version: 2.1

jobs:
  test:
    docker:
      - image: cimg/node:18.17
    parallelism: 4
    steps:
      - checkout
      - run:
          name: Install dependencies
          command: npm ci
      
      - run:
          name: Run tests in parallel
          command: |
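            # Split tests across parallel containers. Timing-based splitting relies on
            # timing data from earlier runs uploaded with store_test_results.
            TESTFILES=$(circleci tests glob "test/**/*.test.js" | circleci tests split --split-by=timings)
            # Reporter and output flags depend on your test runner; adjust as needed
            npm test -- $TESTFILES --reporter json --outputFile test-results-${CIRCLE_NODE_INDEX}.json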
      
      - run:
          name: Generate coverage report
          command: |
            npm run coverage -- --outputFile coverage-${CIRCLE_NODE_INDEX}.json
      
      # Persist test results and coverage from this container
      - persist_to_workspace:
          root: .
          paths:
            - test-results-*.json
            - coverage-*.json
            # Do not persist node_modules here: every container would write the
            # same paths, and attach_workspace fails on duplicate files.

Step 2: Create the Collection Job

  collect_test_results:
    docker:
      - image: cimg/node:18.17
    steps:
      - checkout
      
      # Attach workspace containing results from all parallel containers
      - attach_workspace:
          at: .
      
      - run:
          name: Combine test results
          command: |
            # Create output directory
            mkdir -p combined-results
            
            # Merge all JSON test results into a single file
            echo "[]" > combined-results/all-tests.json
            for file in test-results-*.json; do
              if [ -f "$file" ]; then
                # Merge JSON arrays (simplified - you may need more robust merging)
                jq -s 'add' combined-results/all-tests.json "$file" > temp.json
                mv temp.json combined-results/all-tests.json
              fi
            done
      
      - run:
          name: Combine coverage reports
          command: |
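            # Merge coverage data (assumes Istanbul-format coverage JSON).
            # nyc merge reads every file in its input directory, so gather
            # the per-container files into a directory of their own first.
            mkdir -p coverage-parts .nyc_output combined-results
            cp coverage-*.json coverage-parts/
            npx nyc merge coverage-parts .nyc_output/coverage.json
            cp .nyc_output/coverage.json combined-results/coverage.json
            
            # Generate final coverage report (nyc report reads .nyc_output by default)
            npx nyc report --reporter=html --report-dir=combined-results/coverage-html
            npx nyc report --reporter=lcov --report-dir=combined-results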
      
      - run:
          name: Generate summary report
          command: |
            # Create a summary of all test results
            node -e "
              const fs = require('fs');
              const testData = JSON.parse(fs.readFileSync('combined-results/all-tests.json'));
              const summary = {
                totalTests: testData.length,
                passed: testData.filter(t => t.status === 'passed').length,
                failed: testData.filter(t => t.status === 'failed').length,
                timestamp: new Date().toISOString()
              };
              fs.writeFileSync('combined-results/summary.json', JSON.stringify(summary, null, 2));
              console.log('Test Summary:', summary);
            "
      
      # Store combined results as artifacts
      - store_artifacts:
          path: combined-results
          destination: test-reports
      
      # Persist combined results for downstream jobs
      - persist_to_workspace:
          root: .
          paths:
            - combined-results
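
The jq loop in the "Combine test results" step assumes every results file holds a JSON array. As a hedged alternative, the following Python sketch merges the per-container files while skipping any file that is unreadable, unparsable, or not an array. The function name merge_result_files is illustrative and not part of CircleCI.

```python
import glob
import json


def merge_result_files(pattern):
    """Concatenate the JSON arrays found in every file matching pattern.

    Returns (merged_records, skipped_paths). Files that cannot be parsed,
    or whose top-level value is not a list, are skipped rather than fatal.
    """
    merged, skipped = [], []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as handle:
                data = json.load(handle)
        except (OSError, json.JSONDecodeError):
            skipped.append(path)
            continue
        if isinstance(data, list):
            merged.extend(data)
        else:
            skipped.append(path)
    return merged, skipped
```

Calling merge_result_files("test-results-*.json") from the collection job returns the combined records plus the list of skipped files, which can be logged as a warning instead of failing the whole merge.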

Step 3: Configure the Workflow

workflows:
  test_and_collect:
    jobs:
      - test
      - collect_test_results:
          requires:
            - test
      
      # Optional: downstream job that uses collected results
      - deploy:
          requires:
            - collect_test_results
          filters:
            branches:
              only: main

Advanced Examples

Example 1: Python with pytest and Coverage

jobs:
  test_python:
    docker:
      - image: cimg/python:3.9
    parallelism: 3
    steps:
      - checkout
      - run: pip install -r requirements.txt
      
      - run:
          name: Run parallel tests
          command: |
            # Split test files
            TESTFILES=$(circleci tests glob "tests/test_*.py" | circleci tests split --split-by=timings)
            
            # Run tests with coverage and JUnit XML output.
            # COVERAGE_FILE gives each container its own raw data file, which is
            # what `coverage combine` expects (it cannot combine XML or JSON reports).
            COVERAGE_FILE=.coverage.${CIRCLE_NODE_INDEX} python -m pytest $TESTFILES \
              --junitxml=test-results-${CIRCLE_NODE_INDEX}.xml \
              --cov=src \
              --cov-report=xml:coverage-${CIRCLE_NODE_INDEX}.xml \
              --cov-report=json:coverage-${CIRCLE_NODE_INDEX}.json
      
      - persist_to_workspace:
          root: .
          paths:
            - test-results-*.xml
            - coverage-*.xml
            - coverage-*.json
            - .coverage.*

  collect_python_results:
    docker:
      - image: cimg/python:3.9
    steps:
      - checkout
      - attach_workspace:
          at: .
      
      - run:
          name: Install coverage tools
          command: pip install "coverage[toml]" junitparser
      
      - run:
          name: Combine coverage data
          command: |
            mkdir -p combined-results
            # Combine the raw .coverage.N data files from each container
            python -m coverage combine .coverage.*
            python -m coverage xml -o combined-results/coverage.xml
            python -m coverage html -d combined-results/coverage-html
            python -m coverage report > combined-results/coverage-report.txt
      
      - run:
          name: Merge JUnit XML files
          command: |
            python -c "
            import xml.etree.ElementTree as ET
            import glob
            import os
            
            # Create combined results directory
            os.makedirs('combined-results', exist_ok=True)
            
            # Parse all XML files and combine
            combined = ET.Element('testsuites')
            
            for xml_file in glob.glob('test-results-*.xml'):
                tree = ET.parse(xml_file)
                root = tree.getroot()
                if root.tag == 'testsuite':
                    combined.append(root)
                elif root.tag == 'testsuites':
                    for suite in root:
                        combined.append(suite)
            
            # Write combined XML
            tree = ET.ElementTree(combined)
            tree.write('combined-results/junit.xml', encoding='utf-8', xml_declaration=True)
            "
      
      - store_test_results:
          path: combined-results
      
      - store_artifacts:
          path: combined-results
          destination: python-test-reports
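
The merged junit.xml written by the step above has a bare testsuites root with no aggregate totals. If a downstream step needs counts, they can be derived from the test cases themselves. The following is a minimal sketch; the function name summarize_junit and the shape of the returned dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Count test cases by outcome in a JUnit XML document."""
    root = ET.fromstring(xml_text)
    counts = {"total": 0, "failed": 0, "errored": 0, "skipped": 0}
    # iter() walks nested suites, so both <testsuite> and <testsuites> roots work
    for case in root.iter("testcase"):
        counts["total"] += 1
        if case.find("failure") is not None:
            counts["failed"] += 1
        elif case.find("error") is not None:
            counts["errored"] += 1
        elif case.find("skipped") is not None:
            counts["skipped"] += 1
    counts["passed"] = (
        counts["total"] - counts["failed"] - counts["errored"] - counts["skipped"]
    )
    return counts
```

Reading the file in binary mode and passing the bytes to summarize_junit lets the XML parser honor the encoding declared in the file.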

Example 2: Multi-Language Project with Different Test Types

jobs:
  test_frontend:
    docker:
      - image: cimg/node:18.17
    parallelism: 2
    steps:
      - checkout
      - run: cd frontend && npm ci
      - run:
          name: Run frontend tests
          command: |
            cd frontend
            TESTFILES=$(circleci tests glob "src/**/*.test.js" | circleci tests split)
            npm test -- $TESTFILES --coverage --testResultsProcessor=jest-junit
            mv junit.xml ../frontend-results-${CIRCLE_NODE_INDEX}.xml
            mv coverage ../frontend-coverage-${CIRCLE_NODE_INDEX}
      
      - persist_to_workspace:
          root: .
          paths:
            - frontend-results-*.xml
            - frontend-coverage-*

  test_backend:
    docker:
      - image: cimg/go:1.20
    parallelism: 2
    steps:
      - checkout
      - run:
          name: Run backend tests
          command: |
            PACKAGES=$(go list ./... | circleci tests split)
            gotestsum --junitfile backend-results-${CIRCLE_NODE_INDEX}.xml \
              --format testname -- -coverprofile=backend-coverage-${CIRCLE_NODE_INDEX}.out \
              $PACKAGES
      
      - persist_to_workspace:
          root: .
          paths:
            - backend-results-*.xml
            - backend-coverage-*.out

  collect_all_results:
    docker:
      - image: cimg/base:stable
    steps:
      - checkout
      - attach_workspace:
          at: .
      
      # Install necessary tools
      - run:
          name: Install tools
          command: |
            # Install Node.js for frontend processing
            curl -fsSL https://deb.nodesource.com/setup_18.x | sudo -E bash -
            sudo apt-get install -y nodejs
            
            # Install Go for backend processing
            wget https://go.dev/dl/go1.20.linux-amd64.tar.gz
            sudo tar -C /usr/local -xzf go1.20.linux-amd64.tar.gz
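            # A plain export only lasts for this step; write to BASH_ENV so
            # later steps also see Go on the PATH
            echo 'export PATH=$PATH:/usr/local/go/bin' >> "$BASH_ENV"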
      
      - run:
          name: Process all test results
          command: |
            mkdir -p combined-results/{frontend,backend}
            
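            # Collect frontend results. Concatenating XML files produces an
            # invalid document, so keep one file per container;
            # store_test_results accepts a directory of XML files.
            cp frontend-results-*.xml combined-results/frontend/
            
            # Merge frontend coverage. nyc merge takes a single input directory,
            # so gather each container's coverage-final.json into one place.
            mkdir -p coverage-parts frontend/.nyc_output
            for dir in frontend-coverage-*; do
              cp "$dir/coverage-final.json" "coverage-parts/$(basename "$dir").json"
            done
            cd frontend
            npx nyc merge ../coverage-parts .nyc_output/coverage.json
            cp .nyc_output/coverage.json ../combined-results/frontend/coverage.json
            npx nyc report --reporter=lcov --report-dir=../combined-results/frontend
            cd ..
            
            # Collect backend results (one JUnit file per container)
            cp backend-results-*.xml combined-results/backend/
            
            # Merge backend coverage (assumes every profile uses the default "set" mode)
            echo "mode: set" > combined-results/backend/coverage.out
            grep -h -v "^mode:" backend-coverage-*.out >> combined-results/backend/coverage.out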
      
      - run:
          name: Generate project summary
          command: |
            # Create overall project test summary
            node -e "
              const fs = require('fs');
              const path = require('path');
              
              const summary = {
                timestamp: new Date().toISOString(),
                frontend: {},
                backend: {},
                overall: {}
              };
              
              // Parse frontend results (simplified)
              // Parse backend results (simplified)
              // Calculate overall metrics
              
              fs.writeFileSync('combined-results/project-summary.json', 
                JSON.stringify(summary, null, 2));
            "
      
      - store_test_results:
          path: combined-results
      
      - store_artifacts:
          path: combined-results
          destination: project-test-reports
      
      - persist_to_workspace:
          root: .
          paths:
            - combined-results

workflows:
  comprehensive_testing:
    jobs:
      - test_frontend
      - test_backend
      - collect_all_results:
          requires:
            - test_frontend
            - test_backend
      
      - deploy:
          requires:
            - collect_all_results
          filters:
            branches:
              only: main
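
The backend coverage merge in collect_all_results writes a fixed "mode: set" header and appends every non-header line. A slightly more defensive version takes the mode from the inputs and refuses to mix modes. The sketch below is one way to do that in Python; the function name merge_go_coverprofiles is hypothetical, and it assumes each container covered a disjoint set of packages, so duplicate blocks are not summed.

```python
def merge_go_coverprofiles(profiles):
    """Merge the text of several Go cover profiles into one profile.

    Each profile starts with a 'mode: <name>' line followed by block lines.
    Raises ValueError if the inputs disagree on the mode or none declare one.
    """
    mode = None
    blocks = []
    for text in profiles:
        for line in text.splitlines():
            if not line.strip():
                continue
            if line.startswith("mode:"):
                this_mode = line.split(":", 1)[1].strip()
                if mode is None:
                    mode = this_mode
                elif mode != this_mode:
                    raise ValueError(f"mixed coverage modes: {mode} vs {this_mode}")
            else:
                blocks.append(line)
    if mode is None:
        raise ValueError("no 'mode:' line found in any profile")
    return "\n".join([f"mode: {mode}"] + blocks) + "\n"
```

The function works on file contents rather than paths, so the caller decides how the backend-coverage-*.out files are read and where the merged profile is written.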

Best Practices

1. Workspace Organization

Structure your workspace persistence thoughtfully:

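- persist_to_workspace:
    root: .
    paths:
      # Environment variables are not expanded in `paths`, so persist the parent
      # directories and have each container write into its own subdirectory in a
      # run step, for example test-results/container-${CIRCLE_NODE_INDEX}.
      - test-results
      - coverage
      - logs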

2. Error Handling

Make your collection job robust with proper error handling:

- run:
    name: Combine results with error handling
    command: |
      set -e  # Exit on error
      
      mkdir -p combined-results
      
      # Check if any test result files exist
      if ls test-results-*.json >/dev/null 2>&1; then
        echo "Found test result files, combining..."
        # Combine logic here
      else
        echo "Warning: No test result files found"
        echo '{"error": "No test results found", "timestamp": "'$(date -Iseconds)'"}' > combined-results/error.json
      fi
      
      # Always create a summary, even if partial
      echo "Creating summary..."
      # Summary logic here

3. Memory and Storage Considerations

Be mindful of workspace size and collection job resources:

collect_test_results:
  docker:
    - image: cimg/node:18.17
  resource_class: medium  # Increase if processing large files
  steps:
    # ... other steps
    
    - run:
        name: Clean up large intermediate files
        command: |
          # Remove large source files after processing
          rm -rf node_modules
          rm -f *.log
          
          # Keep only essential combined results

4. Conditional Processing

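Handle scenarios where some parallel containers did not produce results:

- run:
    name: Conditional result processing
    command: |
      # CIRCLE_NODE_TOTAL describes the job that is currently running, so in a
      # non-parallel collection job it is 1. Set this to the test job's parallelism.
      total_containers=4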
      found_results=0
      
      for i in $(seq 0 $((total_containers-1))); do
        if [ -f "test-results-${i}.json" ]; then
          found_results=$((found_results + 1))
        else
          echo "Warning: Missing results from container ${i}"
        fi
      done
      
      echo "Found results from ${found_results}/${total_containers} containers"
      
      if [ $found_results -eq 0 ]; then
        echo "Error: No test results found from any container"
        exit 1
      fi
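
The same completeness check can be written as a small Python helper that reports exactly which containers are missing. This is a sketch under the assumption that result files follow the test-results-N.json naming used in this guide; the function name missing_containers and its parameters are illustrative.

```python
import os


def missing_containers(directory, expected_total, template="test-results-{index}.json"):
    """Return the container indexes whose result file is absent from directory."""
    return [
        index
        for index in range(expected_total)
        if not os.path.isfile(os.path.join(directory, template.format(index=index)))
    ]
```

Here expected_total should match the parallelism value of the test job. An empty list means every container reported, and a list as long as expected_total means no results arrived at all.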

Integration Examples

Codecov Integration

- run:
    name: Upload to Codecov
    command: |
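      # Upload combined coverage report.
      # Note: Codecov has deprecated this Bash uploader; check the Codecov
      # documentation for the currently supported uploader or orb.
      bash <(curl -s https://codecov.io/bash) -f combined-results/lcov.info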

Slack Notifications

- run:
    name: Send test summary to Slack
    command: |
      # Build the payload with jq so quotes and newlines in the summary are escaped
      jq -n --arg text "Test Results: $(cat combined-results/summary.json)" '{text: $text}' |
        curl -X POST -H 'Content-type: application/json' --data @- "$SLACK_WEBHOOK_URL"
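
When the notification is sent from a script, the message body can be produced with a JSON encoder so that any quotes or newlines are escaped correctly. The sketch below builds the request body only and leaves the HTTP call to the caller. The function name build_slack_payload is hypothetical, and the summary keys are the ones written by the summary step in this guide.

```python
import json


def build_slack_payload(summary):
    """Return a JSON request body with a one-line test summary as its text."""
    text = (
        f"Test Results: {summary['passed']}/{summary['totalTests']} passed, "
        f"{summary['failed']} failed"
    )
    return json.dumps({"text": text})
```

The returned string can be written to a file and posted with curl --data @file, or sent with any HTTP client.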

Quality Gates

- run:
    name: Quality gate check
    command: |
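      # Assumes summary.json includes a coverage.percentage field; add it in your
      # summary step, since the basic summary earlier records only test counts.
      COVERAGE=$(jq -r '.coverage.percentage // 0' combined-results/summary.json)
      FAILURES=$(jq -r '.failed // 0' combined-results/summary.json)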
      
      if (( $(echo "$COVERAGE < 80" | bc -l) )) || [ "$FAILURES" -gt 0 ]; then
        echo "Quality gate failed: Coverage: $COVERAGE%, Failures: $FAILURES"
        exit 1
      fi
      
      echo "Quality gate passed!"
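
The same gate can be expressed in Python, which makes a missing coverage value an explicit failure rather than a shell arithmetic error. This is a minimal sketch: the function name quality_gate, the default threshold, and the coverage.percentage field are assumptions that mirror the shell example.

```python
def quality_gate(summary, min_coverage=80.0):
    """Return a list of human-readable gate failures; an empty list means pass."""
    problems = []
    failed = summary.get("failed", 0)
    if failed > 0:
        problems.append(f"{failed} test(s) failed")
    coverage = summary.get("coverage", {}).get("percentage")
    if coverage is None:
        problems.append("coverage percentage missing from summary")
    elif coverage < min_coverage:
        problems.append(f"coverage {coverage}% is below {min_coverage}%")
    return problems
```

A wrapper script would load combined-results/summary.json, print each returned problem, and exit with a non-zero status when the list is not empty.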

Troubleshooting

Common Issues

  1. Missing workspace data: Ensure all parallel jobs use persist_to_workspace
  2. File path conflicts: Use unique file names with ${CIRCLE_NODE_INDEX}; attach_workspace fails when two containers persist the same path
  3. Large workspace sizes: Clean up unnecessary files before persisting
  4. JSON merging errors: Validate JSON format in parallel jobs
  5. Permission issues: Ensure consistent user permissions across jobs

Debugging Tips

- run:
    name: Debug workspace contents
    command: |
      echo "Current directory contents:"
      find . -name "test-results-*" -o -name "coverage-*"
      
      echo "Workspace size:"
      du -sh .
      
      echo "Available disk space:"
      df -h
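
When the workspace is larger than expected, it helps to know which files account for the size. The following sketch lists the largest files under a directory. The function name largest_files and the default limit are illustrative choices.

```python
import os


def largest_files(root, limit=10):
    """Return up to limit (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and other entries that are not regular files
            if os.path.isfile(path):
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)[:limit]
```

Running it against the workspace root before persist_to_workspace shows whether dependency directories or logs are being persisted unintentionally.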

Conclusion

The collection job pattern is a powerful technique for consolidating test results from parallel CircleCI jobs. By properly implementing workspace sharing and result aggregation, you can maintain the performance benefits of parallel testing while meeting the requirements of downstream tools and processes that need unified test data.

This approach scales well with your testing needs and provides a clean separation of concerns between test execution and result processing, making your CI/CD pipeline more maintainable and reliable.

 
