Add seed parameter (optional) and custom evaluation metric for citations overlap #122
Purpose
This pull request introduces several changes to add a new metric for evaluating citation matches and to incorporate a random seed for reproducibility in various functions. The most important changes include adding the new `CitationsMatchedMetric` class, updating configuration files to include the new metric, and modifying functions to accept and use a seed parameter.

New Metric Addition:

- `evals/eval_config.json`: Updated the `requested_metrics` to include `"citations_matched"` and added a `seed` parameter in `target_parameters`.
- `evals/evaluate.py`: Added the `CitationsMatchedMetric` class to compute the percentage of citations matched between the response and the ground truth. Registered this new metric in the evaluation pipeline.

Seed Parameter for Reproducibility:

- `src/backend/fastapi_app/api_models.py`: Added a `seed` field to the `ChatRequestOverrides` model.
- `src/backend/fastapi_app/rag_advanced.py`: Modified multiple functions (`generate_search_query`, `prepare_context`, `answer`) to accept and use the `seed` parameter.
- `src/backend/fastapi_app/rag_base.py`: Updated the `get_params` method to include the `seed` parameter.
- `src/backend/fastapi_app/rag_simple.py`: Added the `seed` parameter to the `answer` and `answer_stream` methods.

Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.
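
For context on the breaking-change question: the PR title describes the seed parameter as optional. The sketch below illustrates how an optional seed can be threaded through without affecting callers that never set it. It is not the code from this PR: `Overrides` and `completion_kwargs` are hypothetical names, a plain dataclass stands in for the real request model, and the 0.3 default temperature is arbitrary.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Overrides:
    """Hypothetical stand-in for the request overrides; not the PR's actual model."""

    temperature: float = 0.3
    seed: Optional[int] = None  # optional, so existing clients need no change


def completion_kwargs(overrides: Overrides) -> dict[str, Any]:
    """Build keyword arguments for a chat completion call (hypothetical helper)."""
    kwargs: dict[str, Any] = {"temperature": overrides.temperature}
    # Forward the seed only when the caller supplied one; compare against None
    # so that a seed of 0 is still passed through.
    if overrides.seed is not None:
        kwargs["seed"] = overrides.seed
    return kwargs
```

With this shape, a request that omits `seed` produces the same arguments as before, while a request that sets it gets the value passed along to the model call.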
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
- Tests (`python -m pytest`).
- `python -m pytest --cov` to verify 100% coverage of added lines
- `python -m mypy` to check for type errors
- `ruff` manually on my code.
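
To make the citations-overlap idea from the Purpose section concrete, here is a minimal sketch of such a metric. It is an illustration under stated assumptions, not the `CitationsMatchedMetric` implementation from this PR: it assumes citations appear as bracketed numbers such as `[12]`, returns a fraction between 0 and 1 rather than a percentage, and returns 0.0 when the ground truth contains no citations. The name `citations_matched` is used here only as a function name for the example.

```python
import re

# Assumed citation format: a number in square brackets, e.g. "[12]".
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def citations_matched(response: str, ground_truth: str) -> float:
    """Return the fraction of ground-truth citations that also appear in the response."""
    truth = set(CITATION_PATTERN.findall(ground_truth))
    if not truth:
        # Assumption: with nothing to match against, report 0.0 instead of dividing by zero.
        return 0.0
    found = set(CITATION_PATTERN.findall(response))
    return len(truth & found) / len(truth)


# Ground truth cites [1] and [2]; the response cites [1] and [3], so half are matched.
print(citations_matched("See [1] and [3].", "Answer based on [1][2]."))
```

Using sets means repeated citations count once, and extra citations in the response that are absent from the ground truth do not lower the score.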