@@ -4,59 +4,245 @@ Hello, Drians! 👋 Here's a guide to help you understand the minimum specs need

- ## 🖥️ GPU-Enabled Nodes
- These specs are based on a system with 16 CPUs and 64GB RAM.
+ ### RTX3090 Single GPU:

- | Model | GPU Memory | CPU Usage (cores) | RAM Usage |
- | -------------- | -------------- | ----------------- | ------------ |
- | Llama3_1_8B | 6.1 - 6.2 GB | 8.6 - 12.8 cores | 8.5 GB |
- | Phi3Mini | 3.3 - 3.4 GB | 14.4 - 22.5 cores | 7.7 GB |
- | Phi3Medium128k | 10.9 - 11.0 GB | 7.9 - 11.4 cores | 5.3 GB |
- | Phi3Medium | 10.9 - 11.0 GB | 4.3 - 5.7 cores | 5.3 GB |
- | NousTheta | 9.6 GB | 4.1 - 4.8 cores | 6.4 - 6.6 GB |
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | finalend/hermes-3-llama-3.1:8b-q8_0 | 76.4388 |
+ | phi3:14b-medium-4k-instruct-q4_1 | 75.6148 |
+ | phi3:14b-medium-128k-instruct-q4_1 | 76.0658 |
+ | phi3.5:3.8b | 195.0728 |
+ | phi3.5:3.8b-mini-instruct-fp16 | 88.4656 |
+ | gemma2:9b-instruct-q8_0 | 56.2726 |
+ | gemma2:9b-instruct-fp16 | 37.9404 |
+ | llama3.1:latest | 103.3473 |
+ | llama3.1:8b-instruct-q8_0 | 78.5861 |
+ | llama3.1:8b-instruct-fp16 | 50.9302 |
+ | llama3.1:8b-text-q4_K_M | 104.4776 |
+ | llama3.1:8b-text-q8_0 | 82.3980 |
+ | llama3.2:1b | 293.1785 |
+ | llama3.2:3b | 168.7500 |
+ | llama3.2:1b-text-q4_K_M | 349.2497 |
+ | qwen2.5:7b-instruct-q5_0 | 114.0511 |
+ | qwen2.5:7b-instruct-fp16 | 53.5423 |
+ | qwen2.5-coder:1.5b | 238.6117 |
+ | qwen2.5-coder:7b-instruct | 125.2194 |
+ | qwen2.5-coder:7b-instruct-q8_0 | 83.7696 |
+ | qwen2.5-coder:7b-instruct-fp16 | 53.7400 |
+ | qwq | 33.4434 |
+ | deepseek-coder:6.7b | 141.7769 |
+ | deepseek-r1:1.5b | 235.8560 |
+ | deepseek-r1:7b | 121.9637 |
+ | deepseek-r1:8b | 107.5933 |
+ | deepseek-r1:14b | 66.5972 |
+ | deepseek-r1:32b | 34.4669 |
+ | deepseek-r1 | 120.9809 |
+ | driaforall/tiny-agent-a:0.5b | 279.2553 |
+ | driaforall/tiny-agent-a:1.5b | 201.7011 |
+ | driaforall/tiny-agent-a:3b | 135.1052 |
+
+ ### H200 SXM Single GPU:
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | finalend/hermes-3-llama-3.1:8b-q8_0 | 121.2871 |
+ | phi3:14b-medium-4k-instruct-q4_1 | 128.9496 |
+ | phi3:14b-medium-128k-instruct-q4_1 | 124.4223 |
+ | phi3.5:3.8b | 184.3729 |
+ | phi3.5:3.8b-mini-instruct-fp16 | 155.6164 |
+ | gemma2:9b-instruct-q8_0 | 91.6370 |
+ | gemma2:9b-instruct-fp16 | 85.6672 |
+ | llama3.1:latest | 123.8938 |
+ | llama3.1:8b-instruct-q8_0 | 112.3102 |
+ | llama3.1:8b-instruct-fp16 | 108.9053 |
+ | llama3.1:8b-text-q4_K_M | 148.0687 |
+ | llama3.1:8b-text-q8_0 | 135.3251 |
+ | llama3.1:70b-instruct-q4_0 | 47.0107 |
+ | llama3.1:70b-instruct-q8_0 | 35.2827 |
+ | llama3.2:1b | 163.9058 |
+ | llama3.2:3b | 150.6063 |
+ | llama3.3:70b | 39.1993 |
+ | llama3.2:1b-text-q4_K_M | 233.6957 |
+ | qwen2.5:7b-instruct-q5_0 | 126.5432 |
+ | qwen2.5:7b-instruct-fp16 | 103.8552 |
+ | qwen2.5:32b-instruct-fp16 | 40.3735 |
+ | qwen2.5-coder:1.5b | 187.3554 |
+ | qwen2.5-coder:7b-instruct | 119.7279 |
+ | qwen2.5-coder:7b-instruct-q8_0 | 108.9536 |
+ | qwen2.5-coder:7b-instruct-fp16 | 104.0222 |
+ | qwq | 59.4734 |
+ | deepseek-coder:6.7b | 136.8015 |
+ | mixtral:8x7b | 94.9618 |
+ | deepseek-r1:1.5b | 160.8217 |
+ | deepseek-r1:7b | 141.2172 |
+ | deepseek-r1:8b | 136.8324 |
+ | deepseek-r1:14b | 90.3022 |
+ | deepseek-r1:32b | 63.1900 |
+ | deepseek-r1:70b | 39.4153 |
+ | deepseek-r1 | 121.8406 |
+ | driaforall/tiny-agent-a:0.5b | 148.5390 |
+ | driaforall/tiny-agent-a:1.5b | 180.9409 |
+ | driaforall/tiny-agent-a:3b | 111.1869 |
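The suffixes on these model tags (e.g. `q4_K_M`, `q8_0`, `fp16`) are quantization levels, which largely explain the TPS spread between variants of the same model. As a rough back-of-the-envelope only — this sketch ignores KV cache, activations, and runtime overhead, so treat the result as a lower bound, not an exact VRAM requirement:

```python
def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough memory needed for model weights alone, in GB.

    Ignores KV cache, activations, and runtime overhead, so the
    result is a lower bound rather than an exact requirement.
    """
    return params_billion * bits_per_weight / 8

# An 8B model at ~8 bits/weight (q8_0) needs roughly 8 GB for weights;
# the same model at fp16 (16 bits/weight) needs roughly 16 GB.
print(approx_weight_gb(8, 8))   # prints 8.0
print(approx_weight_gb(8, 16))  # prints 16.0
```

More bytes per weight also means more memory traffic per generated token, which is consistent with the fp16 rows being among the slowest entries in the tables above.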

- ## 💻 CPU-Only Nodes

For those running without a GPU, we've got you covered too! Here are the specs for different CPU types:

- ### ARM (4 CPU, 16GB RAM)
+ ### AMD (8 CPU, 16GB RAM)

- | Model | CPU Usage (cores) | RAM Usage |
- | -------------- | ----------------- | ------------- |
- | NousTheta | 3.0 - 3.5 cores | 9.6 GB |
- | Phi3Medium | 3.7 - 3.8 cores | 10.4 GB |
- | Phi3Medium128k | 3.7 - 3.8 cores | 10.4 GB |
- | Phi3Mini | 3.2 - 6.1 cores | 5.6 - 11.4 GB |
- | Llama3_1_8B | 3.4 - 3.7 cores | 6.1 GB |
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | llama3.2:1b | 22.6293 |
+ | llama3.2:1b-text-q4_K_M | 25.0413 |
+ | qwen2.5-coder:1.5b | 21.7418 |
+ | deepseek-r1:1.5b | 29.7842 |
+ | driaforall/tiny-agent-a:0.5b | 54.5455 |
+ | driaforall/tiny-agent-a:1.5b | 19.9501 |
+
+ ### AMD (16 CPU, 32GB RAM)

- ### ARM (8 CPU, 16GB RAM)
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 15.3677 |
+ | llama3.2:1b | 25.6367 |
+ | llama3.2:3b | 16.3185 |
+ | llama3.2:1b-text-q4_K_M | 38.0039 |
+ | qwen2.5-coder:1.5b | 30.3651 |
+ | deepseek-r1:1.5b | 30.2977 |
+ | driaforall/tiny-agent-a:0.5b | 61.2553 |
+ | driaforall/tiny-agent-a:1.5b | 25.7011 |

- | Model | CPU Usage (cores) | RAM Usage |
- | -------------- | ----------------- | ------------- |
- | NousTheta | 6.2 - 6.3 cores | 9.6 GB |
- | Phi3Medium | 6.5 cores | 10.8 GB |
- | Phi3Medium128k | 6.5 cores | 10.8 GB |
- | Phi3Mini | 5.4 - 7.0 cores | 5.8 - 11.6 GB |
- | Llama3_1_8B | 3.4 - 4.2 cores | 6.2 GB |
+ ### AMD (32 CPU, 64GB RAM)

- ### AMD (8 CPU, 16GB RAM)
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 22.9944 |
+ | llama3.2:1b | 40.6091 |
+ | llama3.2:3b | 26.0240 |
+ | llama3.2:1b-text-q4_K_M | 56.2027 |
+ | qwen2.5-coder:1.5b | 44.6331 |
+ | deepseek-coder:6.7b | 15.1620 |
+ | deepseek-r1:1.5b | 43.8323 |
+ | driaforall/tiny-agent-a:0.5b | 59.9854 |
+ | driaforall/tiny-agent-a:1.5b | 27.7891 |
+
+ ### AMD (48 CPU, 96GB RAM)
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 29.7455 |
+ | llama3.1:latest | 17.4744 |
+ | llama3.1:8b-text-q4_K_M | 18.1928 |
+ | llama3.2:1b | 49.1555 |
+ | llama3.2:3b | 33.9283 |
+ | llama3.2:1b-text-q4_K_M | 72.7273 |
+ | qwen2.5:7b-instruct-q5_0 | 17.0779 |
+ | qwen2.5-coder:1.5b | 56.2710 |
+ | qwen2.5-coder:7b-instruct | 18.2935 |
+ | deepseek-coder:6.7b | 21.2014 |
+ | deepseek-r1:1.5b | 55.0080 |
+ | deepseek-r1:7b | 18.0150 |
+ | deepseek-r1:8b | 16.4574 |
+ | deepseek-r1 | 18.0991 |
+ | driaforall/tiny-agent-a:0.5b | 86.2903 |
+ | driaforall/tiny-agent-a:1.5b | 41.6198 |
+ | driaforall/tiny-agent-a:3b | 24.1364 |
+
+ ### AMD (64 CPU, 128GB RAM)
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 33.8993 |
+ | llama3.1:latest | 19.3015 |
+ | llama3.1:8b-text-q4_K_M | 19.9081 |
+ | llama3.2:1b | 55.6815 |
+ | llama3.2:3b | 36.6654 |
+ | llama3.2:1b-text-q4_K_M | 68.9655 |
+ | qwen2.5:7b-instruct-q5_0 | 18.0591 |
+ | qwen2.5-coder:1.5b | 56.7301 |
+ | qwen2.5-coder:7b-instruct | 20.1563 |
+ | deepseek-coder:6.7b | 23.4261 |
+ | deepseek-r1:1.5b | 57.0494 |
+ | deepseek-r1:7b | 20.3577 |
+ | deepseek-r1:8b | 18.6653 |
+ | deepseek-r1 | 20.2571 |
+ | driaforall/tiny-agent-a:0.5b | 94.6503 |
+ | driaforall/tiny-agent-a:1.5b | 49.5431 |
+ | driaforall/tiny-agent-a:3b | 27.1564 |
+
+ ### AMD (96 CPU, 192GB RAM)
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 34.1058 |
+ | llama3.1:latest | 20.2221 |
+ | llama3.1:8b-text-q4_K_M | 20.1473 |
+ | llama3.2:1b | 54.5232 |
+ | llama3.2:3b | 37.6344 |
+ | llama3.2:1b-text-q4_K_M | 65.7570 |
+ | qwen2.5:7b-instruct-q5_0 | 20.2058 |
+ | qwen2.5-coder:1.5b | 55.4435 |
+ | qwen2.5-coder:7b-instruct | 21.3058 |
+ | deepseek-coder:6.7b | 24.6414 |
+ | deepseek-r1:1.5b | 54.3133 |
+ | deepseek-r1:7b | 20.8902 |
+ | deepseek-r1:8b | 18.7142 |
+ | deepseek-r1 | 22.1564 |
+ | driaforall/tiny-agent-a:0.5b | 94.7864 |
+ | driaforall/tiny-agent-a:1.5b | 50.7868 |
+ | driaforall/tiny-agent-a:3b | 29.4635 |
+
+ ### AMD (192 CPU, 384GB RAM)
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | finalend/hermes-3-llama-3.1:8b-q8_0 | 16.8002 |
+ | phi3.5:3.8b | 26.2855 |
+ | phi3.5:3.8b-mini-instruct-fp16 | 16.7343 |
+ | llama3.1:latest | 21.9456 |
+ | llama3.1:8b-instruct-q8_0 | 16.7135 |
+ | llama3.1:8b-text-q4_K_M | 22.5764 |
+ | llama3.1:8b-text-q8_0 | 16.3817 |
+ | llama3.2:1b | 43.5632 |
+ | llama3.2:3b | 29.5560 |
+ | llama3.2:1b-text-q4_K_M | 48.6348 |
+ | qwen2.5:7b-instruct-q5_0 | 21.4938 |
+ | qwen2.5-coder:1.5b | 33.3333 |
+ | qwen2.5-coder:7b-instruct | 21.7933 |
+ | qwen2.5-coder:7b-instruct-q8_0 | 17.8134 |
+ | deepseek-coder:6.7b | 23.4474 |
+ | deepseek-r1:1.5b | 32.7795 |
+ | deepseek-r1:7b | 22.5376 |
+ | deepseek-r1:8b | 20.3057 |
+ | deepseek-r1 | 23.0604 |
+ | driaforall/tiny-agent-a:0.5b | 42.1866 |
+ | driaforall/tiny-agent-a:1.5b | 33.4957 |
+ | driaforall/tiny-agent-a:3b | 24.5138 |
+
+ ### ARM (192 CPU, 384GB RAM)
+
+ | Model | TPS |
+ | ----------------------------------- | --------- |
+ | phi3.5:3.8b | 26.3062 |
+ | llama3.1:latest | 18.9597 |
+ | llama3.1:8b-text-q4_K_M | 18.2489 |
+ | llama3.2:1b | 43.7856 |
+ | llama3.2:3b | 30.3443 |
+ | llama3.2:1b-text-q4_K_M | 49.6852 |
+ | qwen2.5:7b-instruct-q5_0 | 16.8128 |
+ | qwen2.5-coder:1.5b | 38.3562 |
+ | qwen2.5-coder:7b-instruct | 19.5582 |
+ | deepseek-coder:6.7b | 21.2699 |
+ | deepseek-r1:1.5b | 36.0020 |
+ | deepseek-r1:7b | 19.5293 |
+ | deepseek-r1:8b | 18.5300 |
+ | deepseek-r1 | 18.9405 |
+ | driaforall/tiny-agent-a:0.5b | 28.4991 |
+ | driaforall/tiny-agent-a:1.5b | 31.6353 |
+ | driaforall/tiny-agent-a:3b | 22.2788 |

- | Model | CPU Usage (cores) | RAM Usage |
- | -------------- | ----------------- | ------------- |
- | NousTheta | 2.3 - 3.2 cores | 9.5 GB |
- | Phi3Medium | 3.3 - 3.4 cores | 10.3 GB |
- | Phi3Medium128k | 1.6 - 3.2 cores | 10.2 GB |
- | Phi3Mini | 2.8 - 3.1 cores | 5.4 - 11.4 GB |
- | Llama3_1_8B | 4.5 - 4.6 cores | 11.1 GB |

- ### Intel (8 CPU, 16GB RAM)

- | Model | CPU Usage (cores) | RAM Usage |
- | -------------- | ----------------- | ------------- |
- | NousTheta | 2.3 - 2.9 cores | 9.7 GB |
- | Phi3Medium | 3.1 - 3.3 cores | 10.4 GB |
- | Phi3Medium128k | 2.2 - 3.3 cores | 10.3 GB |
- | Phi3Mini | 2.6 - 4.1 cores | 5.4 - 11.0 GB |
- | Llama3_1_8B | 3.7 - 3.9 cores | 11.3 GB |

## 📝 Notes

@@ -66,8 +252,6 @@ For those running without a GPU, we've got you covered too! Here are the specs f

- RAM usage is generally consistent but can spike for certain operations.

- - **Important**: For systems with 4 CPUs and 8GB RAM, only Phi3Mini was able to run successfully.
-
- **Important**: Lower CPU count results in lower performance. Systems with fewer CPUs will process requests more slowly, especially for models that require more CPU resources than are available.

Remember, these are minimum specs, and your experience may vary depending on the specific tasks and workload. Happy node running! 🎉
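
TPS in the tables above is tokens generated per second. If you want to compare your own machine against these numbers, here is a minimal sketch of the conversion, assuming the benchmarks come from an Ollama-style runtime (the model tags above are Ollama tags), whose `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds):

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Compute generation TPS from Ollama-style response fields.

    eval_count:       number of tokens generated in the response
    eval_duration_ns: time spent generating them, in nanoseconds
    """
    return eval_count / eval_duration_ns * 1e9

# Example: 512 tokens generated in 4.19 s is roughly 122 TPS,
# in the range the H200 table reports for llama3.1:latest.
print(round(tokens_per_second(512, 4_190_000_000), 1))  # prints 122.2
```

Averaging this over several prompts of different lengths gives a fairer comparison than a single run, since prompt processing and generation speed differ.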