feat: support using valid set for input.json for dp test #4859
base: devel
Conversation
Pull Request Overview
This PR adds support for using the validation dataset from a training input JSON file with the `dp test` command. The implementation adds an `--input-json` flag that lets users point to a training configuration file; the validation systems defined in that file are then used for testing, instead of requiring the systems to be specified separately.
Key changes:
- Added an `--input-json` command line argument to the `dp test` command
- Enhanced the test function to handle `rglob_patterns` when using input JSON files
- Refactored existing test code to support both traditional and JSON-based testing approaches (a sketch of the JSON-based path follows below)
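A minimal, self-contained sketch of the JSON-based path, using only the standard library; the actual PR goes through deepmd's own `j_loader` and `process_systems` helpers, and the function name below is hypothetical:

```python
import json
from pathlib import Path


def systems_from_input_json(input_json: str) -> list[str]:
    """Sketch: pull validation systems out of a training input JSON file.

    Mirrors the extraction quoted in the review comments below; the real
    implementation resolves rglob patterns via deepmd's own helpers.
    """
    jdata = json.loads(Path(input_json).read_text())
    val_params = jdata.get("training", {}).get("validation_data", {})
    validation = val_params.get("systems")
    if not validation:
        raise RuntimeError("No validation data found in input json")
    # "systems" may be a single path or a list of paths/glob patterns.
    return [validation] if isinstance(validation, str) else list(validation)
```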
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
deepmd/main.py | Added --input-json argument to the test command parser |
deepmd/entrypoints/test.py | Implemented logic to process validation systems from input JSON with rglob pattern support |
source/tests/pt/test_dp_test.py | Refactored tests and added new test cases for input JSON functionality |
Comments suppressed due to low confidence (1)
source/tests/pt/test_dp_test.py:33
- [nitpick] The parameter name 'use_input_json' could be more descriptive. Consider 'use_validation_from_json' or 'load_systems_from_json' to better indicate its purpose.
def _run_dp_test(self, use_input_json: bool, numb_test: int = 0) -> None:
```python
val_params = jdata.get("training", {}).get("validation_data", {})
validation = val_params.get("systems")
if not validation:
    raise RuntimeError("No validation data found in input json")
```
The error message should be more specific about where validation data should be located in the JSON structure. Consider: "No validation systems found in training.validation_data.systems of input json file"
raise RuntimeError("No validation data found in input json") | |
raise RuntimeError("No validation systems found in training.validation_data.systems of input json file") |
```python
val_sys = self.config["training"]["validation_data"]["systems"]
if isinstance(val_sys, list):
    val_sys = val_sys[0]
```
This code duplicates the validation system extraction logic that's already implemented in the main test function. Consider extracting this into a helper method to avoid duplication.
```diff
-val_sys = self.config["training"]["validation_data"]["systems"]
-if isinstance(val_sys, list):
-    val_sys = val_sys[0]
+val_sys = self._get_validation_system()
```
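One possible shape for such a helper on the test class is sketched here; the method name comes from the suggestion above, and its placement and return type are assumptions rather than code from the PR:

```python
def _get_validation_system(self) -> str:
    """Hypothetical helper consolidating the extraction shown above."""
    val_sys = self.config["training"]["validation_data"]["systems"]
    # The config may list several validation systems; these tests use the first.
    if isinstance(val_sys, list):
        val_sys = val_sys[0]
    return val_sys
```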
📝 Walkthrough
The changes introduce support for specifying validation systems via an input JSON file for the `dp test` command.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant test()
    participant j_loader
    participant process_systems
    User->>CLI: Run test command (with or without --input-json)
    CLI->>test(): Call test() with input_json or datafile
    alt input_json provided
        test()->>j_loader: Load JSON file
        j_loader-->>test(): Return parsed JSON
        test()->>process_systems: Extract and resolve systems
        process_systems-->>test(): Return system list
    else datafile provided
        test()->>test(): Read systems from datafile
    else
        test()->>test(): Expand system string
    end
    test()->>test(): Proceed with testing using system list
```
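A condensed sketch of the three-way dispatch the diagram describes; the helper names and the expansion logic here are illustrative stand-ins, not the PR's actual code:

```python
from pathlib import Path


def resolve_systems(
    system: str | None,
    datafile: str | None,
    input_json: str | None,
) -> list[str]:
    # Illustrative only: deepmd's test() uses its own loaders and expansion.
    if input_json is not None:
        # Validation systems come from training.validation_data.systems.
        return systems_from_input_json(input_json)  # see the earlier sketch
    if datafile is not None:
        # The data file lists one system path per line.
        lines = Path(datafile).read_text().splitlines()
        return [ln.strip() for ln in lines if ln.strip()]
    # Otherwise expand the single system string, here by globbing for
    # directories containing type.raw as a stand-in for deepmd's expansion.
    return sorted(str(p.parent) for p in Path(system).rglob("type.raw"))
```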
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15 minutes
🪛 Ruff (0.12.2): deepmd/entrypoints/test.py:113: Undefined name (F821)
Codecov Report
❌ Patch coverage is …
Additional details and impacted files:

```
@@           Coverage Diff            @@
##            devel    #4859    +/-   ##
========================================
  Coverage   84.76%   84.76%
========================================
  Files         699      699
  Lines       68077    68091      +14
  Branches     3541     3541
========================================
+ Hits        57708    57720      +12
- Misses       9235     9236       +1
- Partials     1134     1135       +1
```

☔ View full report in Codecov by Sentry.
Please use `add_mutually_exclusive_group` to group `input_json`, `data_file`, and `system`.
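A sketch of what that grouping could look like; the exact flag spellings, defaults, and help strings are assumptions based on the names in this comment, not deepmd's actual parser definitions:

```python
import argparse

parser = argparse.ArgumentParser(prog="dp test")
# Hypothetical grouping: only one of the three data sources may be given.
group = parser.add_mutually_exclusive_group()
group.add_argument(
    "--input-json",
    type=str,
    default=None,
    help="Training input JSON whose validation systems are used for testing.",
)
group.add_argument(
    "--datafile",
    type=str,
    default=None,
    help="File listing one system path per line.",
)
group.add_argument(
    "--system",
    type=str,
    default=None,
    help="A single system (or directory of systems) to test.",
)
```

With this in place, passing more than one of the three options makes argparse exit with a usage error instead of silently preferring one source.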
"--input-json", | ||
default=None, | ||
type=str, | ||
help="The training input json file. Validation data in the training script will be used for testing.", |
It's unclear why validation data is used rather than training data. I suggest adding two options.
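One hypothetical reading of "two options", sketched purely for illustration: a flag per data section, neither of which exists in the PR or in deepmd's CLI:

```python
import argparse

parser = argparse.ArgumentParser(prog="dp test")
# Purely illustrative flag names: the reviewer only asked for "two options".
group = parser.add_mutually_exclusive_group()
group.add_argument(
    "--input-json-validation",
    type=str,
    default=None,
    help="Training input JSON; test on training.validation_data.systems.",
)
group.add_argument(
    "--input-json-training",
    type=str,
    default=None,
    help="Training input JSON; test on training.training_data.systems.",
)
```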
Signed-off-by: Chun Cai <amoycaic@gmail.com>
Summary by CodeRabbit
New Features
Bug Fixes
Tests