chore: load cpu architecture dependent models #1800

Closed
wants to merge 2,533 commits into from

Conversation

rootfs
Contributor

@rootfs rootfs commented Sep 26, 2024

This is to prepare for coming CPU dependent models

sthaha and others added 30 commits July 10, 2024 09:33
…ium-ebpf

chore(pkg/bpf): Replace libbpfgo with cilium/ebpf
…updates

Bumps the github-actions group with 2 updates in the / directory: [anchore/sbom-action](https://github.com/anchore/sbom-action) and [actions/upload-artifact](https://github.com/actions/upload-artifact).


Updates `anchore/sbom-action` from 0.16.0 to 0.16.1
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Commits](anchore/sbom-action@v0.16.0...v0.16.1)

Updates `actions/upload-artifact` from 4.3.0 to 4.3.4
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v4.3.0...v4.3.4)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
This commit adds scaphandre to the dev compose so that power prediction
can be compared against it. Additionally, the grafana dashboards are updated
to compare `process_package` power against scaphandre's process power
consumption.

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
…-dev

chore(compose): add scaphandre to dev compose
Cleaned YAML files to remove trailing lines.

Signed-off-by: Kaiyi <kaiyiliu21@gmail.com>
Cleaned YAML files to resolve yamllint action errors.

Signed-off-by: Kaiyi <kaiyiliu21@gmail.com>
…puting-io/dependabot/github_actions/github-actions-b63beb1316

build(deps): bump the github-actions group across 1 directory with 2 updates
Remove cache clearing and update from scaph dockerfile.

Signed-off-by: Kaiyi <kaiyiliu21@gmail.com>
…efile-macos

fix: Makefile for running on macOS
This commit adds a more comprehensive eBPF test suite.
Currently it tests the operation of a number of key functions
within the eBPF code - for example the main sched_switch
tracepoint that we run. In addition, it runs a number
of micro benchmarks so we can track performance of these
key pieces of code.

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
This commit addresses and resolves various linting issues in the validator module
Additionally, it includes the following improvements:

- Add a new make target to run the linter.
- Add `__init__.py` to the `tests/validator` directory
  to resolve the linting issue: implicit-namespace-package (INP001).
- Suppress certain linting issues that are intentional or not applicable in our context.

Signed-off-by: vprashar2929 <vprashar@redhat.com>
Refactor pkg/sensors/accelerator to use a more generic device
abstraction that different devices can plug into.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
Removed Autoheal feature of the health check.

Signed-off-by: Kaiyi <kaiyiliu21@gmail.com>
Added apt-get update back to the scaph dockerfile, as it
fails to install curl without it.

Signed-off-by: Kaiyi <kaiyiliu21@gmail.com>
…x-lint

fix(validator): resolve linting issues
This commit allows grafana to be accessed without logging in as
admin user. It also solves the nagging change password issue.

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
…ecommit-shellcheck-mardown

chore: precommit add markdown + spellcheck
Bumps the go-dependencies group with 1 update: [github.com/prometheus/prometheus](https://github.com/prometheus/prometheus).


Updates `github.com/prometheus/prometheus` from 0.53.0 to 0.53.1
- [Release notes](https://github.com/prometheus/prometheus/releases)
- [Changelog](https://github.com/prometheus/prometheus/blob/main/CHANGELOG.md)
- [Commits](prometheus/prometheus@v0.53.0...v0.53.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/prometheus
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: go-dependencies
...

Signed-off-by: dependabot[bot] <support@github.com>
…puting-io/dependabot/go_modules/go-dependencies-d988561a6c

build(deps): bump github.com/prometheus/prometheus from 0.53.0 to 0.53.1 in the go-dependencies group
…afana-login

chore(compose/grafana): allow anonymous login with admin role
This commit improves the README with detailed instructions on how to set up
and run the Docker Compose for VM validations. Additionally, it updates the
steps required to launch the validator tool.

Signed-off-by: vprashar2929 <vibhu.sharma2929@gmail.com>
…l-readme

docs(validator): update the documentation for running the validator
Signed-off-by: Vimal Kumar <vimal78@gmail.com>
Signed-off-by: Vimal Kumar <vimal78@gmail.com>
…te-mock-acpi

feat: Add mock-acpi validation to validator
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
maryamtahhan and others added 18 commits September 17, 2024 05:00
The builder stage for the Kepler image needs to have
the dcgm/habana libraries installed for the build
tags to work.

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
…gm-build

fixes: aa66ada adding the needed libraries to the builder stage
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
…puting-io/dependabot/go_modules/go-dependencies-f1759efdc8

build(deps): bump the go-dependencies group across 1 directory with 6 updates
…roup-cleanup-globals

chore: cleanup globals in pkg/cgroup
Signed-off-by: Vimal Kumar <vimal78@gmail.com>
…-bpf-metrics

fix(metrics): Remove resource usage check for skipping bpf metrics
Signed-off-by: Vimal Kumar <vimal78@gmail.com>
Signed-off-by: Vimal Kumar <vimal78@gmail.com>
Bumps the github-actions group with 1 update in the / directory: [securego/gosec](https://github.com/securego/gosec).


Updates `securego/gosec` from 2.20.0 to 2.21.3
- [Release notes](https://github.com/securego/gosec/releases)
- [Changelog](https://github.com/securego/gosec/blob/master/.goreleaser.yml)
- [Commits](securego/gosec@6fbd381...be8bd6e)

---
updated-dependencies:
- dependency-name: securego/gosec
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
…bvirt-exporter

chore(compose): Add libvirt-exporter
…puting-io/dependabot/github_actions/github-actions-c2deb92154

build(deps): bump securego/gosec from 2.20.0 to 2.21.3 in the github-actions group across 1 directory
… github-actions group across 1 directory"

Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
…puting-io/revert-1794-dependabot/github_actions/github-actions-c2deb92154

Revert "build(deps): bump securego/gosec from 2.20.0 to 2.21.3 in the github-actions group across 1 directory"
Signed-off-by: Maryam Tahhan <mtahhan@redhat.com>
…e-bpf-overhead

fix(bpf): exclude bpf overhead in bpf_cpu_time
…date-gosec-to-master

chore: update gosec to use master
Signed-off-by: Huamin Chen <hchen@redhat.com>
@rootfs rootfs requested review from sthaha and sunya-ch September 26, 2024 14:53
Contributor

github-actions bot commented Sep 26, 2024

🤖 SeineSailor

Here is a concise summary of the pull request changes:

Summary: This pull request, "chore: load cpu architecture dependent models", enables loading of CPU architecture-dependent models by introducing a cpuArch parameter and incorporating node.CPUArchitecture() in various functions. The changes prepare the codebase for CPU-dependent models, affecting power model loading behavior based on CPU architecture.

Key Modifications:

  1. Added cpuArch parameter to GetDefaultPowerModelURL function.
  2. Incorporated node.CPUArchitecture() in multiple functions.
  3. Modified GetNodePlatformPowerFromDummyServer, GetNodeComponentsPowerFromDummyServer, and GetProcessComponentsPower functions to consider CPU architecture.
  4. Updated genRegressor function to load CPU architecture-specific models.

Impact: These changes will allow the codebase to load power models based on CPU architecture, potentially affecting the behavior of the code when loading power models.

Observations/Suggestions:

  • It would be beneficial to include additional tests to ensure the correct loading of CPU architecture-dependent models.
  • Consider documenting the implications of CPU architecture on power model loading behavior for future reference.
  • Review the modified functions to ensure they handle different CPU architectures correctly.

@rootfs rootfs requested a review from KaiyiLiu1234 September 26, 2024 15:02
cpuArch = strings.TrimSpace(cpuArch)
fullPath := fmt.Sprintf(`/var/lib/kepler/data/%s/model_weight/%s_%sModel.json`, cpuArch, energySource, modelOutputType)
// if the model does not exist, return the default model
if _, err := os.Stat(fullPath); os.IsNotExist(err) {
Collaborator

Won't this result in a possible race condition? If you can guarantee that after the os.Stat the file exists, then ignore this comment.

return fmt.Sprintf(`/var/lib/kepler/data/model_weight/%s_%sModel.json`, energySource, modelOutputType)
func GetDefaultPowerModelURL(modelOutputType, energySource, cpuArch string) string {
// strip white space or new line from cpuArch
cpuArch = strings.TrimSuffix(cpuArch, "\n")
Collaborator

TrimSpace should remove \n characters and white space, so you can remove this line
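
A quick way to confirm this is a minimal sketch like the one below. `normalizeArch` is a hypothetical helper and the input string is an illustrative value, not one taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeArch is a hypothetical helper, not part of the PR. It shows that
// strings.TrimSpace alone strips leading and trailing whitespace, including
// a trailing newline, so a preceding TrimSuffix(cpuArch, "\n") adds nothing.
func normalizeArch(cpuArch string) string {
	return strings.TrimSpace(cpuArch)
}

func main() {
	// The input is an illustrative value only.
	fmt.Printf("%q\n", normalizeArch("  example-arch \n")) // prints "example-arch"
}
```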

cpuArch = strings.TrimSpace(cpuArch)
fullPath := fmt.Sprintf(`/var/lib/kepler/data/%s/model_weight/%s_%sModel.json`, cpuArch, energySource, modelOutputType)
// if the model does not exist, return the default model
if _, err := os.Stat(fullPath); os.IsNotExist(err) {
Collaborator

On a minor note, os.IsNotExist(err) might be misleading, as os.Stat can return other types of errors, which are not the same as a file/model not existing.

Collaborator

these files shouldn't change at all 🤔

Collaborator

@sthaha sthaha left a comment

LGTM, but requesting changes to remove the eBPF file changes.

@rootfs rootfs marked this pull request as draft November 1, 2024 00:10
@sthaha sthaha closed this Jul 9, 2025