34 changes: 34 additions & 0 deletions .github/workflows/build.yml
@@ -110,6 +110,7 @@ jobs:
run: CI_ENV=1 CI_MINIMIZE_DISK_USAGE=1 ./ci/ci-tx-sync-tests.sh

coverage:
needs: fuzz
strategy:
fail-fast: false
runs-on: self-hosted
@@ -133,6 +134,11 @@ jobs:
# Maybe if codecov wasn't broken we wouldn't need to do this...
./codecov --verbose upload-process --disable-search --fail-on-error -f target/codecov.json -t "f421b687-4dc2-4387-ac3d-dc3b2528af57" -F 'tests'
cargo clean
- name: Download honggfuzz corpus
uses: actions/download-artifact@v4
with:
name: hfuzz-corpus
path: fuzz/hfuzz_workspace
- name: Run fuzz coverage generation
run: |
./contrib/generate_fuzz_coverage.sh --output-dir `pwd` --output-codecov-json
@@ -261,13 +267,41 @@ jobs:
- name: Install Rust ${{ env.TOOLCHAIN }} toolchain
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --profile=minimal --default-toolchain ${{ env.TOOLCHAIN }}
# This is read-only for PRs. It seeds the fuzzer for a more effective run.
# NOTE: The `key` is unique and will always miss, forcing a fallback to
# the `restore-keys` to find the latest global cache from the `main` branch.
- name: Restore persistent fuzz corpus (PR)
if: ${{ github.ref != 'refs/heads/main' }}
uses: actions/cache/restore@v4
with:
path: fuzz/hfuzz_workspace
key: fuzz-corpus-${{ github.ref }}-${{ github.sha }}
restore-keys: |
fuzz-corpus-refs/heads/main-
# The `restore-keys` performs a prefix search to find the most recent
# cache from a previous `main` run. We then save with a new, unique
# `key` (using the SHA) to ensure the cache is always updated,
# as caches are immutable.
- name: Restore/Save persistent honggfuzz corpus (Main)
if: ${{ github.ref == 'refs/heads/main' }}
uses: actions/cache@v4
with:
path: fuzz/hfuzz_workspace
key: fuzz-corpus-refs/heads/main-${{ github.sha }}
Collaborator: Where do we save to fuzz-corpus-refs/heads/main-? This includes the sha.

Contributor Author:

> Where do we save to fuzz-corpus-refs/heads/main-?

No, the key isn't a save location but an identifier used to save and search for a cache.

> this includes the sha.

Yes. Because we can't mutate an already existing cache, but we still need to ensure the newly updated corpus is cached. The sha ensures the cache key is unique.

Collaborator:

> No, the key isn't a save location but an identifier used to save and search for a cache.

Wait, doesn't this statement contradict itself?

> Yes. Because we can't mutate an already existing cache, but we still need to ensure the newly updated corpus is cached. The sha ensures the cache key is unique.

But how does the read end know where to look for it?

Contributor Author:

> Wait, doesn't this statement contradict itself?

Right, generally the key is used both to save and to search. But the idea here is to rely on the key to save the updated corpus under a unique name (since we can't mutate an already existing cache) and to rely on restore-keys to search.

> But how does the read end know where to look for it?

When there's a miss on key, the restore-keys does a prefix search and restores/downloads the most recently created matching cache.

For example, when this runs the first time the corpus gets saved as fuzz-corpus-refs/heads/main-sha123. On the second run the key becomes fuzz-corpus-refs/heads/main-sha456 and will miss, so it falls back to the restore-keys and restores the most recent cache with the prefix fuzz-corpus-refs/heads/main-. That downloads fuzz-corpus-refs/heads/main-sha123; the run uses it, updates it, and saves it as fuzz-corpus-refs/heads/main-sha456, and the loop continues.

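To tie the thread together, here is a minimal sketch of the key/restore-keys pattern being described; it mirrors the cache step in the diff above, and the sha suffixes mentioned in the comments are placeholders rather than real commits:

```yaml
# Minimal sketch of the pattern discussed above (placeholder values, not the exact workflow).
- name: Restore/Save persistent honggfuzz corpus (Main)
  uses: actions/cache@v4
  with:
    path: fuzz/hfuzz_workspace
    # Unique per commit, so the exact-key lookup always misses and the post-job
    # save always creates a fresh cache entry (existing caches are immutable).
    key: fuzz-corpus-refs/heads/main-${{ github.sha }}
    # On a key miss, this prefix search restores the most recently created cache
    # whose key starts with the prefix, e.g. fuzz-corpus-refs/heads/main-sha123
    # from the previous main run.
    restore-keys: |
      fuzz-corpus-refs/heads/main-
```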
Collaborator: Ohhhhhh, okay, I wasn't clear that it does a prefix search. Can you add a comment noting that? Otherwise LGTM!

Contributor Author: Sure! I just updated the comment and pushed 5edb8cf.

Let me know when I can squash my second fixup commit into my first commit.

restore-keys: |
Collaborator: Presumably when running on main we don't need the restore-keys trick?

Contributor Author:

> Presumably when running on main we don't need the restore-keys trick?

No, we still do. Because the save key includes the sha, the lookup will always miss on restore, and the restore-keys is what restores the most recent matching cache.

Collaborator: Right, but then where do we save in a way that something knows where to find it?

Contributor Author: The restore-keys does a prefix search and restores the most recently created cache with the prefix provided.

fuzz-corpus-refs/heads/main-
- name: Sanity check fuzz targets on Rust ${{ env.TOOLCHAIN }}
run: |
cd fuzz
RUSTFLAGS="--cfg=fuzzing --cfg=secp256k1_fuzz --cfg=hashes_fuzz" cargo test --verbose --color always --lib --bins -j8
cargo clean
- name: Run fuzzers
run: cd fuzz && ./ci-fuzz.sh && cd ..
- name: Upload honggfuzz corpus
Collaborator: Rather than only uploading, is there a way to make this directory persistent so that we can keep it between fuzz jobs?

Contributor Author: I'm not sure if we really need to persist the directory here. My understanding is that the fuzz job runs on the latest code changes on every PR, so the generated corpus is tailored to the code changes on that PR. If we persist the corpus from a previous run and use that on a new run, won't that produce incorrect/misleading coverage data?

Collaborator: I don't think the point of the fuzz job is only to generate coverage data, but rather to test the code :). Having a bit more coverage data from fuzzing than we "deserve" is okay, at least now that we split the coverage data out so that codecov shows fuzzing separately, and having a persistent fuzzing corpus means our fuzzing is much more likely to catch issues.

Contributor Author: Right, how long do you think we can have this directory persisted? The upload-artifact action has a retention-days input that can be used to persist the artifact for a while. The default is 90 days but can be adjusted (https://github.com/actions/upload-artifact?tab=readme-ov-file#retention-period).

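For illustration only, a sketch of how that retention-days input would be attached to the upload step below; the value shown is just the documented default, and this is not what the PR ultimately did (it moved to the cache action instead):

```yaml
# Hypothetical tweak to the existing upload step, shown only to illustrate retention-days.
- name: Upload honggfuzz corpus
  uses: actions/upload-artifact@v4
  with:
    name: hfuzz-corpus
    path: fuzz/hfuzz_workspace
    retention-days: 90  # documented default; adjustable within repository limits
```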
Collaborator: I believe the simple "upload-artifact" task just stores data for this CI run. What I was thinking is some kind of persistent directory that's shared across jobs so that each CI fuzz task picks up the latest directory, does some fuzzing, finds new test cases, then uploads a new copy with more tests in it.

Contributor Author:

> What I was thinking is some kind of persistent directory that's shared across jobs so that each CI fuzz task picks up the latest directory, does some fuzzing, finds new test cases, then uploads a new copy with more tests in it.

Makes sense. I pushed eea2e4b to handle this using GitHub's cache action (https://github.com/actions/cache?tab=readme-ov-file).

uses: actions/upload-artifact@v4
with:
name: hfuzz-corpus
path: fuzz/hfuzz_workspace

linting:
runs-on: ubuntu-latest
23 changes: 20 additions & 3 deletions contrib/generate_fuzz_coverage.sh
@@ -62,9 +62,26 @@ if [ "$OUTPUT_CODECOV_JSON" = "0" ]; then
cargo llvm-cov --html --ignore-filename-regex "fuzz/" --output-dir "$OUTPUT_DIR"
echo "Coverage report generated in $OUTPUT_DIR/html/index.html"
else
cargo llvm-cov -j8 --codecov --ignore-filename-regex "fuzz/" --output-path "$OUTPUT_DIR/fuzz-codecov.json"
echo "Fuzz codecov report available at $OUTPUT_DIR/fuzz-codecov.json"
fi
# Clean previous coverage artifacts to ensure a fresh run.
cargo llvm-cov clean --workspace

# Import honggfuzz corpus if the artifact was downloaded.
if [ -d "hfuzz_workspace" ]; then
echo "Importing corpus from hfuzz_workspace..."
for target_dir in hfuzz_workspace/*; do
[ -d "$target_dir" ] || continue
src_name="$(basename "$target_dir")"
# Map e.g. hfuzz_workspace/<name>_target to test_cases/<name>
dest="${src_name%_target}"
mkdir -p "test_cases/$dest"
# Copy corpus files into the test_cases directory
find "$target_dir" -maxdepth 2 -type f -path "$target_dir/input/*" \
-print0 | xargs -0 -I{} cp -n {} "test_cases/$dest/"
done
fi

echo "Replaying imported corpus (if found) via tests to generate coverage..."
cargo llvm-cov -j8 --codecov --ignore-filename-regex "fuzz/" \
--output-path "$OUTPUT_DIR/fuzz-codecov.json" --tests

echo "Fuzz codecov report available at $OUTPUT_DIR/fuzz-codecov.json"
fi