
Advanced counter arrays #221

@cardillan


The new Mindustry Logic ability to access variables indirectly by name (even in the @this processor) allows various arrangements of internal arrays.

We categorize the internal array mechanisms into two classes:

  • Regular arrays: the jump tables are generated separately for reads and writes, and they use an access variable to read/write a value from/to an array element. This is the usual @counter array supported by Mindustry Logic since version 6.
  • Compact arrays: there is just one jump table. The jump table stores the name of the variable corresponding to the array element into the access variable. A separate read or write instruction is then needed to read/write the element value from either a local or a remote processor. If the array element variables aren't accessed directly, they need to be created using draw triangle instructions. This mechanism requires Mindustry BE.
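As a hedged illustration (not code from the issue; the names retaddr, access and result are made up), a regular array's read jump table copies element values, while a compact array's single table only stores element names:

```
# Regular array, read jump table (sketch): each entry copies the
# element value into the access variable 'result', then returns.
set result array0
set @counter retaddr
set result array1
set @counter retaddr
# ...plus a separate, equally large write jump table.

# Compact array, the single jump table (sketch): each entry stores
# only the element *name*; a separate read/write resolves it.
set access "array0"
set @counter retaddr
set access "array1"
set @counter retaddr
# ...the caller then does e.g.: read result @this access
```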

Compact arrays may offer an advantage when elements are always read and written at the same index (e.g. array[index]++). Here, the compact array requires just one variable name lookup, making this actually faster than a regular array.
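For instance, array[index]++ on a compact array can resolve the element name once and reuse it for both the read and the write (a sketch under the assumptions above; access is the variable set by the compact jump table):

```
# after jumping through the compact table, 'access' holds e.g. "array3"
read value @this access      # read the element by name
op add value value 1         # update
write value @this access     # write back through the same name
```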

Single processor

The size of arrays in a single processor setting is limited by available instruction space. Compact arrays can be twice as large given the same space.

The decision to use compact versus regular arrays on a single processor is made by the compiler. The arrays will always be created as compact arrays; the optimizer will then generate additional regular jump tables, or replace compact tables with regular jump tables, using the optimization-for-speed mechanism.

On top of that, both compact and regular array access can be inlined.

Support for declaring compact arrays might be added (a new compact modifier keyword). These arrays won't be expanded to regular arrays.

Remote processors

The various configurations here are described from the point of view of a local processor, which uses a remote processor to store either the variables, or some portions of the access code, or both.

Some schemes might allow the remote processor to use the array for its own processing as well, but this is not part of this evaluation. The remote processor here serves only for providing access to the array to the local processor and doesn't contain other code.

Since the code is spread over multiple processors and potentially compiled separately, a concrete configuration needs to be specified by the programmer - it cannot be altered by the optimizer.

Remote storage, local access code

The only scenario where this arrangement brings some benefits for the local processor is when using a compact array. In this case, the variables need not be created in the local processor, only in the remote processor, saving the space needed for draw triangle instructions.

Pros:

  • Array access doesn't require synchronization.
  • Random access as fast as local compact arrays.
  • Saves some instruction space in the local processor (specifically, the draw triangle instructions, about one twelfth the size of the array jump table).

Cons:

  • Direct element access is converted to a read/write operation instead of a local variable access.

When compact arrays get implemented in Mindcode, this functionality will be provided just by declaring a remote array.
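A hedged sketch of this arrangement: the compact jump table stays in the local processor, but since the element variables exist only in the remote processor, the final access becomes a read/write against processor1 (the linked remote processor):

```
# local processor (sketch): jump into the local compact table,
# which sets 'access' to the element name, e.g. "array3"
# ...
# then access the element stored in the remote processor:
read value processor1 access     # to read
write value processor1 access    # to write
# no draw triangle instructions needed locally - the element
# variables exist only in the remote processor
```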

Local storage, remote access code: compact array

Supports up to 500 array elements - the largest contiguous internal array possible.

Pros:

  • Local storage means direct element access can be optimized down to actual variable.
  • Array access served in a single tick even on microprocessor.

Cons:

  • Requires synchronization; one element access takes one tick on average (however, read-update-write sequence still requires just one remote call).

Conclusion: should be supported.

Implementation

Remote processor: links the local processor as processor1. Code:

write "array0" processor1 "array"
wait 1e15 # or stop, both should work
write "array1" processor1 "array"
wait 1e15 
write "array2" processor1 "array"
wait 1e15 
...
write "array499" processor1 "array"
wait 1e15     # last (1000th) instruction  

Local processor: links the remote processor as processor1. Creates variables array0 to array499 using draw triangle.

Access to array element at position index:

op mul address index 2
write address processor1 `@counter`
wait 1e-12
# to read
read variable @this array
# to write
write value @this array
...
draw triangle array0 array1 array2 array3 array4 array5
...

The local processor uses wait to wait for the next tick, during which the array variable will be set even if the remote processor is a microprocessor. If the local processor accesses the array element once per tick on average, and in bursts of at most 4 accesses in a single tick, the average IPT (instructions per tick) of the local processor should not suffer.

It might break down at FPS higher than 60, though. To make sure it won't break, the array access would have to be modified to this:

op mul address index 2
set array null
write address processor1 `@counter`
loop:
wait 1e-12
jump loop strictEqual array null
# to read
read variable @this array
# to write
write value @this array

Another possibility might be to specify a time in wait precisely equal to one tick (or maybe a wee bit shorter). Will need some experiments to confirm.
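If that approach works, the guarded loop could collapse back to the simple sequence, just with a tick-length wait (the 0.0167 constant, one tick at 60 FPS, is an assumption pending those experiments):

```
op mul address index 2
write address processor1 `@counter`
wait 0.0167      # ~1/60 s; exact safe value to be confirmed
# to read
read variable @this array
# to write
write value @this array
```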

Local storage, remote access code: regular array

Supports up to 166 array elements.

For regular arrays, the remote processor needs to read/write the value from/to the proper access variable in local storage.

Pros:

  • Local storage means direct element access can be optimized down to actual variable.
  • Reads may be one instruction faster than compact arrays, depending on context.

Cons:

  • Writes are not faster than writes in compact arrays, and might even be slower sometimes.
  • Supports only 1/3 of the elements compared to a compact array.
  • Can't be proofed against FPS > 60 (or maybe it can, using the tick-long wait instruction).
  • Array access requires three instructions: a microprocessor is not fast enough.

Conclusion: won't be supported.

Implementation

Element #0 read, followed by element #0 write - remote code:

read value processor1 "array0"
write value processor1 "arrayRead"
wait 1e15
read value processor1 "arrayWrite"
write value processor1 "array0"
wait 1e15

Read at index - local code:

op mul address index 6
write address processor1 `@counter`
wait 1e-12
print arrayRead

Write at index - local code:

set arrayWrite value
op mul address index 6
op add address address 3
write address processor1 `@counter`
wait 1e-12

Remote storage, remote access code: compact array

Supports up to 461 elements.

This scheme is a variation on Local storage, remote access code: compact array.

Local code is modified to read array element from the remote processor instead of @this. Remote code must create instructions using draw triangle, reducing maximum array size to 461 elements.

Pros:

  • Array access served in a single tick even on microprocessor.
  • Frees up to 77 draw triangle instructions on the local processor.

Cons:

  • Requires synchronization; one element access takes one tick on average (however, read-update-write sequence still requires just one remote call).
  • Direct element access is converted to a read/write operation instead of a local variable access.

Conclusion: should be supported if the implementation proves to be easy (I expect it will).
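A sketch of the expected implementation (not spelled out in the issue): the remote code keeps the same jump table as in the local-storage variant and additionally creates array0 to array460 with draw triangle; the local code only changes the final access target from @this to processor1:

```
# local code (sketch)
op mul address index 2
write address processor1 `@counter`
wait 1e-12
# to read
read variable processor1 array
# to write
write value processor1 array
```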

Remote storage, remote access code: regular array

Supports up to 250 array elements.

Array elements are referenced directly by jump tables: no need for draw triangle initialization.

Pros:

  • Reads may be one instruction faster than compact arrays, depending on context.
  • Array access served in a single tick even on microprocessor.

Cons:

  • Writes are not faster than writes in compact arrays, and might even be slower sometimes.
  • Read-update-write can't be optimized.
  • Can't be proofed against FPS > 60.
  • Direct element access is converted to a read/write operation instead of a local variable access.

Conclusion: compared to a compact array, there are practically only downsides. Won't be supported.

Implementation

Element #0 read, followed by element #0 write - remote code:

write array0 processor1 "arrayRead"
wait 1e15
read array0 processor1 "arrayWrite"
wait 1e15

Read at index - local code:

op mul address index 4
write address processor1 `@counter`
wait 1e-12
print arrayRead

Write at index - local code:

set arrayWrite value
op mul address index 4
op add address address 2
write address processor1 `@counter`
wait 1e-12
