Qualcomm AI Engine Direct - Support simple_eval in calibration, perpl… #12958
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12958
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures
As of commit 7a1a1d3 with merge base 9e00a51:
NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Hi @cccclai,
Some jobs run less frequently, such as https://github.com/pytorch/executorch/blob/b6b7a16df5e7852d976d8c34c8a7e9a1b6f7d005/.github/workflows/periodic.yml and https://github.com/pytorch/executorch/blob/b6b7a16df5e7852d976d8c34c8a7e9a1b6f7d005/.github/workflows/nightly.yml, and they won't block CI. Maybe we can use these jobs.
4f7d594 to dd3b173
Sure! We will add CI support under these yml files in a future PR. Thanks.
dd3b173 to 1cd8022
Hi @cccclai, there are some incoming PRs we would like to let you know about:
Thank you.
bb81d56 to f0e16d1
Hi @cccclai, after this PR is merged, we will push another PR that applies some optimizations. With those optimizations, we should be able to get a perplexity (ppl) score of 12 for QNN on device, which aligns with the result from prepare_pt2e and convert_pt2e.
43ce5b2 to 7a1a1d3
Looks great, thank you!
Summary
llama.py
A follow-up PR will address:
Script
python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -s $DEVICE -m SM8750 --prompt "What is 1+1?" --temperature 0 --model_mode kv --max_seq_len 1024 --ptq 16a8w --decoder_model qwen2_5 --eval_perplexity --tasks wikitext
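For readers unfamiliar with the metric behind --eval_perplexity, here is a minimal sketch of the generic token-level definition: perplexity is the exponential of the mean negative log-likelihood the model assigns to the reference tokens. This is not the code used by llama.py; the function name `perplexity` and its input format are hypothetical, and the script may report a differently normalized variant (for example, word-level perplexity on wikitext).

```python
import math


def perplexity(token_log_probs):
    """Token-level perplexity from natural-log probabilities.

    `token_log_probs` is a hypothetical input: one log-probability per
    reference token, as assigned by the model under evaluation.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood over all evaluated tokens.
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    # Perplexity is exp(mean NLL); lower is better.
    return math.exp(mean_nll)


if __name__ == "__main__":
    # A model that spreads probability uniformly over 12 candidates at
    # every step has a perplexity of 12 (up to floating-point rounding).
    uniform_12 = [math.log(1.0 / 12.0)] * 100
    print(perplexity(uniform_12))
```

Under this definition, a score of 12 means the model is, on average, as uncertain as if it were choosing uniformly among 12 tokens at each position.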
Test plan
python backends/qualcomm/tests/test_qnn_delegate.py -k TestExampleLLMScript.test_static_qwen2_5 --model SM8650 --build_folder build-android/ --executorch_root . -s $DEVICE
Author: @shewu-quic, @winskuo-quic