Commit e97c8ee: Merge pull request #183 from firstbatchxyz/kasim

specs

2 parents: 8a2c88e + 2dadb73

1 file changed: docs/NODE_SPECS.md (+226 -42 lines)

Hello, Drians! 👋 Here's a guide to help you understand the minimum specs need…

## 🖥️ GPU-Enabled Nodes

**Removed** in this commit:

These specs are based on a system with 16 CPUs and 64GB RAM.

| Model | GPU Memory | CPU Usage (cores) | RAM Usage |
| -------------- | -------------- | ----------------- | ------------ |
| Llama3_1_8B | 6.1 - 6.2 GB | 8.6 - 12.8 cores | 8.5 GB |
| Phi3Mini | 3.3 - 3.4 GB | 14.4 - 22.5 cores | 7.7 GB |
| Phi3Medium128k | 10.9 - 11.0 GB | 7.9 - 11.4 cores | 5.3 GB |
| Phi3Medium | 10.9 - 11.0 GB | 4.3 - 5.7 cores | 5.3 GB |
| NousTheta | 9.6 GB | 4.1 - 4.8 cores | 6.4 - 6.6 GB |

### RTX3090 Single GPU:

| Model | TPS |
| ----------------------------------- | -------- |
| finalend/hermes-3-llama-3.1:8b-q8_0 | 76.4388 |
| phi3:14b-medium-4k-instruct-q4_1 | 75.6148 |
| phi3:14b-medium-128k-instruct-q4_1 | 76.0658 |
| phi3.5:3.8b | 195.0728 |
| phi3.5:3.8b-mini-instruct-fp16 | 88.4656 |
| gemma2:9b-instruct-q8_0 | 56.2726 |
| gemma2:9b-instruct-fp16 | 37.9404 |
| llama3.1:latest | 103.3473 |
| llama3.1:8b-instruct-q8_0 | 78.5861 |
| llama3.1:8b-instruct-fp16 | 50.9302 |
| llama3.1:8b-text-q4_K_M | 104.4776 |
| llama3.1:8b-text-q8_0 | 82.3980 |
| llama3.2:1b | 293.1785 |
| llama3.2:3b | 168.7500 |
| llama3.2:1b-text-q4_K_M | 349.2497 |
| qwen2.5:7b-instruct-q5_0 | 114.0511 |
| qwen2.5:7b-instruct-fp16 | 53.5423 |
| qwen2.5-coder:1.5b | 238.6117 |
| qwen2.5-coder:7b-instruct | 125.2194 |
| qwen2.5-coder:7b-instruct-q8_0 | 83.7696 |
| qwen2.5-coder:7b-instruct-fp16 | 53.7400 |
| qwq | 33.4434 |
| deepseek-coder:6.7b | 141.7769 |
| deepseek-r1:1.5b | 235.8560 |
| deepseek-r1:7b | 121.9637 |
| deepseek-r1:8b | 107.5933 |
| deepseek-r1:14b | 66.5972 |
| deepseek-r1:32b | 34.4669 |
| deepseek-r1 | 120.9809 |
| driaforall/tiny-agent-a:0.5b | 279.2553 |
| driaforall/tiny-agent-a:1.5b | 201.7011 |
| driaforall/tiny-agent-a:3b | 135.1052 |
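
TPS in these tables is tokens generated per second, and the model names are Ollama tags. If you want a rough sanity check of your own hardware against the numbers above, a single Ollama generation request already reports the token count and generation time needed to compute it. Below is a minimal sketch, assuming a local Ollama server on the default port and a model tag you have already pulled; the prompt and tag are placeholders, and this is not necessarily the exact benchmark harness used for these tables.

```python
# Minimal sketch: estimate decode TPS (tokens per second) for one Ollama model.
# Assumes a local Ollama server on the default port (11434) and that the model
# tag below is already pulled; adjust both to match your node.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.2:1b"  # any tag from the tables above
PROMPT = "Explain what a Dria node does in one paragraph."

payload = json.dumps({"model": MODEL, "prompt": PROMPT, "stream": False}).encode()
req = urllib.request.Request(
    OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# eval_count = generated tokens, eval_duration = generation time in nanoseconds
tps = result["eval_count"] / (result["eval_duration"] / 1e9)
print(f"{MODEL}: {tps:.4f} TPS")
```

Running it a few times and averaging smooths out warm-up effects such as loading the model weights on the first request.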

### H200 SXM Single GPU:

| Model | TPS |
| ----------------------------------- | -------- |
| finalend/hermes-3-llama-3.1:8b-q8_0 | 121.2871 |
| phi3:14b-medium-4k-instruct-q4_1 | 128.9496 |
| phi3:14b-medium-128k-instruct-q4_1 | 124.4223 |
| phi3.5:3.8b | 184.3729 |
| phi3.5:3.8b-mini-instruct-fp16 | 155.6164 |
| gemma2:9b-instruct-q8_0 | 91.6370 |
| gemma2:9b-instruct-fp16 | 85.6672 |
| llama3.1:latest | 123.8938 |
| llama3.1:8b-instruct-q8_0 | 112.3102 |
| llama3.1:8b-instruct-fp16 | 108.9053 |
| llama3.1:8b-text-q4_K_M | 148.0687 |
| llama3.1:8b-text-q8_0 | 135.3251 |
| llama3.1:70b-instruct-q4_0 | 47.0107 |
| llama3.1:70b-instruct-q8_0 | 35.2827 |
| llama3.2:1b | 163.9058 |
| llama3.2:3b | 150.6063 |
| llama3.3:70b | 39.1993 |
| llama3.2:1b-text-q4_K_M | 233.6957 |
| qwen2.5:7b-instruct-q5_0 | 126.5432 |
| qwen2.5:7b-instruct-fp16 | 103.8552 |
| qwen2.5:32b-instruct-fp16 | 40.3735 |
| qwen2.5-coder:1.5b | 187.3554 |
| qwen2.5-coder:7b-instruct | 119.7279 |
| qwen2.5-coder:7b-instruct-q8_0 | 108.9536 |
| qwen2.5-coder:7b-instruct-fp16 | 104.0222 |
| qwq | 59.4734 |
| deepseek-coder:6.7b | 136.8015 |
| mixtral:8x7b | 94.9618 |
| deepseek-r1:1.5b | 160.8217 |
| deepseek-r1:7b | 141.2172 |
| deepseek-r1:8b | 136.8324 |
| deepseek-r1:14b | 90.3022 |
| deepseek-r1:32b | 63.1900 |
| deepseek-r1:70b | 39.4153 |
| deepseek-r1 | 121.8406 |
| driaforall/tiny-agent-a:0.5b | 148.5390 |
| driaforall/tiny-agent-a:1.5b | 180.9409 |
| driaforall/tiny-agent-a:3b | 111.1869 |

## 💻 CPU-Only Nodes

For those running without a GPU, we've got you covered too! Here are the specs for different CPU types:

**Removed**:

### ARM (4 CPU, 16GB RAM)

| Model | CPU Usage (cores) | RAM Usage |
| -------------- | ----------------- | ------------- |
| NousTheta | 3.0 - 3.5 cores | 9.6 GB |
| Phi3Medium | 3.7 - 3.8 cores | 10.4 GB |
| Phi3Medium128k | 3.7 - 3.8 cores | 10.4 GB |
| Phi3Mini | 3.2 - 6.1 cores | 5.6 - 11.4 GB |
| Llama3_1_8B | 3.4 - 3.7 cores | 6.1 GB |

### AMD (8 CPU, 16GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| llama3.2:1b | 22.6293 |
| llama3.2:1b-text-q4_K_M | 25.0413 |
| qwen2.5-coder:1.5b | 21.7418 |
| deepseek-r1:1.5b | 29.7842 |
| driaforall/tiny-agent-a:0.5b | 54.5455 |
| driaforall/tiny-agent-a:1.5b | 19.9501 |

### AMD (16 CPU, 32GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 15.3677 |
| llama3.2:1b | 25.6367 |
| llama3.2:3b | 16.3185 |
| llama3.2:1b-text-q4_K_M | 38.0039 |
| qwen2.5-coder:1.5b | 30.3651 |
| deepseek-r1:1.5b | 30.2977 |
| driaforall/tiny-agent-a:0.5b | 61.2553 |
| driaforall/tiny-agent-a:1.5b | 25.7011 |

**Removed**:

### ARM (8 CPU, 16GB RAM)

| Model | CPU Usage (cores) | RAM Usage |
| -------------- | ----------------- | ------------- |
| NousTheta | 6.2 - 6.3 cores | 9.6 GB |
| Phi3Medium | 6.5 cores | 10.8 GB |
| Phi3Medium128k | 6.5 cores | 10.8 GB |
| Phi3Mini | 5.4 - 7.0 cores | 5.8 - 11.6 GB |
| Llama3_1_8B | 3.4 - 4.2 cores | 6.2 GB |

### AMD (32 CPU, 64GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 22.9944 |
| llama3.2:1b | 40.6091 |
| llama3.2:3b | 26.0240 |
| llama3.2:1b-text-q4_K_M | 56.2027 |
| qwen2.5-coder:1.5b | 44.6331 |
| deepseek-coder:6.7b | 15.1620 |
| deepseek-r1:1.5b | 43.8323 |
| driaforall/tiny-agent-a:0.5b | 59.9854 |
| driaforall/tiny-agent-a:1.5b | 27.7891 |

### AMD (48 CPU, 96GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 29.7455 |
| llama3.1:latest | 17.4744 |
| llama3.1:8b-text-q4_K_M | 18.1928 |
| llama3.2:1b | 49.1555 |
| llama3.2:3b | 33.9283 |
| llama3.2:1b-text-q4_K_M | 72.7273 |
| qwen2.5:7b-instruct-q5_0 | 17.0779 |
| qwen2.5-coder:1.5b | 56.2710 |
| qwen2.5-coder:7b-instruct | 18.2935 |
| deepseek-coder:6.7b | 21.2014 |
| deepseek-r1:1.5b | 55.0080 |
| deepseek-r1:7b | 18.0150 |
| deepseek-r1:8b | 16.4574 |
| deepseek-r1 | 18.0991 |
| driaforall/tiny-agent-a:0.5b | 86.2903 |
| driaforall/tiny-agent-a:1.5b | 41.6198 |
| driaforall/tiny-agent-a:3b | 24.1364 |

### AMD (64 CPU, 128GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 33.8993 |
| llama3.1:latest | 19.3015 |
| llama3.1:8b-text-q4_K_M | 19.9081 |
| llama3.2:1b | 55.6815 |
| llama3.2:3b | 36.6654 |
| llama3.2:1b-text-q4_K_M | 68.9655 |
| qwen2.5:7b-instruct-q5_0 | 18.0591 |
| qwen2.5-coder:1.5b | 56.7301 |
| qwen2.5-coder:7b-instruct | 20.1563 |
| deepseek-coder:6.7b | 23.4261 |
| deepseek-r1:1.5b | 57.0494 |
| deepseek-r1:7b | 20.3577 |
| deepseek-r1:8b | 18.6653 |
| deepseek-r1 | 20.2571 |
| driaforall/tiny-agent-a:0.5b | 94.6503 |
| driaforall/tiny-agent-a:1.5b | 49.5431 |
| driaforall/tiny-agent-a:3b | 27.1564 |

### AMD (96 CPU, 192GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 34.1058 |
| llama3.1:latest | 20.2221 |
| llama3.1:8b-text-q4_K_M | 20.1473 |
| llama3.2:1b | 54.5232 |
| llama3.2:3b | 37.6344 |
| llama3.2:1b-text-q4_K_M | 65.7570 |
| qwen2.5:7b-instruct-q5_0 | 20.2058 |
| qwen2.5-coder:1.5b | 55.4435 |
| qwen2.5-coder:7b-instruct | 21.3058 |
| deepseek-coder:6.7b | 24.6414 |
| deepseek-r1:1.5b | 54.3133 |
| deepseek-r1:7b | 20.8902 |
| deepseek-r1:8b | 18.7142 |
| deepseek-r1 | 22.1564 |
| driaforall/tiny-agent-a:0.5b | 94.7864 |
| driaforall/tiny-agent-a:1.5b | 50.7868 |
| driaforall/tiny-agent-a:3b | 29.4635 |

### AMD (192 CPU, 384GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| finalend/hermes-3-llama-3.1:8b-q8_0 | 16.8002 |
| phi3.5:3.8b | 26.2855 |
| phi3.5:3.8b-mini-instruct-fp16 | 16.7343 |
| llama3.1:latest | 21.9456 |
| llama3.1:8b-instruct-q8_0 | 16.7135 |
| llama3.1:8b-text-q4_K_M | 22.5764 |
| llama3.1:8b-text-q8_0 | 16.3817 |
| llama3.2:1b | 43.5632 |
| llama3.2:3b | 29.5560 |
| llama3.2:1b-text-q4_K_M | 48.6348 |
| qwen2.5:7b-instruct-q5_0 | 21.4938 |
| qwen2.5-coder:1.5b | 33.3333 |
| qwen2.5-coder:7b-instruct | 21.7933 |
| qwen2.5-coder:7b-instruct-q8_0 | 17.8134 |
| deepseek-coder:6.7b | 23.4474 |
| deepseek-r1:1.5b | 32.7795 |
| deepseek-r1:7b | 22.5376 |
| deepseek-r1:8b | 20.3057 |
| deepseek-r1 | 23.0604 |
| driaforall/tiny-agent-a:0.5b | 42.1866 |
| driaforall/tiny-agent-a:1.5b | 33.4957 |
| driaforall/tiny-agent-a:3b | 24.5138 |

### ARM (192 CPU, 384GB RAM)

| Model | TPS |
| ----------------------------------- | -------- |
| phi3.5:3.8b | 26.3062 |
| llama3.1:latest | 18.9597 |
| llama3.1:8b-text-q4_K_M | 18.2489 |
| llama3.2:1b | 43.7856 |
| llama3.2:3b | 30.3443 |
| llama3.2:1b-text-q4_K_M | 49.6852 |
| qwen2.5:7b-instruct-q5_0 | 16.8128 |
| qwen2.5-coder:1.5b | 38.3562 |
| qwen2.5-coder:7b-instruct | 19.5582 |
| deepseek-coder:6.7b | 21.2699 |
| deepseek-r1:1.5b | 36.0020 |
| deepseek-r1:7b | 19.5293 |
| deepseek-r1:8b | 18.5300 |
| deepseek-r1 | 18.9405 |
| driaforall/tiny-agent-a:0.5b | 28.4991 |
| driaforall/tiny-agent-a:1.5b | 31.6353 |
| driaforall/tiny-agent-a:3b | 22.2788 |
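
The CPU-only sections above are keyed by core count and total RAM, so the first step is knowing those two numbers for your own machine. Here is a minimal standard-library sketch; the /proc/meminfo read assumes Linux, so on other systems check RAM through your usual OS tools instead.

```python
# Minimal sketch: print this machine's CPU count and total RAM so it can be
# matched against the CPU-only sections above. Standard library only; the
# /proc/meminfo read assumes Linux.
import os

print(f"CPUs: {os.cpu_count()}")

try:
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                total_kb = int(line.split()[1])  # value is reported in kB
                print(f"RAM: {total_kb / 1024 / 1024:.0f} GB")
                break
except FileNotFoundError:
    print("RAM: check via your OS tools (non-Linux system)")
```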

**Removed**:

### AMD (8 CPU, 16GB RAM)

| Model | CPU Usage (cores) | RAM Usage |
| -------------- | ----------------- | ------------- |
| NousTheta | 2.3 - 3.2 cores | 9.5 GB |
| Phi3Medium | 3.3 - 3.4 cores | 10.3 GB |
| Phi3Medium128k | 1.6 - 3.2 cores | 10.2 GB |
| Phi3Mini | 2.8 - 3.1 cores | 5.4 - 11.4 GB |
| Llama3_1_8B | 4.5 - 4.6 cores | 11.1 GB |

### Intel (8 CPU, 16GB RAM)

| Model | CPU Usage (cores) | RAM Usage |
| -------------- | ----------------- | ------------- |
| NousTheta | 2.3 - 2.9 cores | 9.7 GB |
| Phi3Medium | 3.1 - 3.3 cores | 10.4 GB |
| Phi3Medium128k | 2.2 - 3.3 cores | 10.3 GB |
| Phi3Mini | 2.6 - 4.1 cores | 5.4 - 11.0 GB |
| Llama3_1_8B | 3.7 - 3.9 cores | 11.3 GB |

## 📝 Notes

…

- RAM usage is generally consistent but can spike for certain operations.
- **Important**: Lower CPU count results in lower performance. Systems with fewer CPUs will process requests more slowly, especially for models that require more CPU resources than are available (see the quick example below).

**Removed**:

- **Important**: For systems with 4 CPUs and 8GB RAM, only Phi3Mini was able to run successfully.

Remember, these are minimum specs, and your experience may vary depending on the specific tasks and workload. Happy node running! 🎉
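
As a rough rule of thumb, the time to produce a reply is the number of generated tokens divided by TPS, ignoring prompt processing and other overhead. That is why the low-core CPU configurations feel much slower: a 500-token reply takes about 2 seconds at 250 TPS but roughly 25 seconds at 20 TPS. A tiny sketch of that arithmetic:

```python
# Rough arithmetic only: decode time ≈ generated tokens / TPS.
# Prompt processing and other overhead are ignored.
def approx_seconds(tokens: int, tps: float) -> float:
    return tokens / tps

for tps in (20, 50, 100, 250):
    print(f"{tps:>3} TPS -> ~{approx_seconds(500, tps):.1f} s for a 500-token reply")
```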
