
Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
AdamR_AYX
Alteryx Alumni (Retired)

In my last blog post (Part 1 - Why AMP?) I looked at the reasons why we built a new multi-threaded core engine; in this post we will take a look at some key concepts that make the AMP engine tick. These are not things you need to know about to use the engine (the engine takes care of it all for you); they are for the technically advanced user who wants an understanding of what's going on under the hood. We will cover the following concepts:

  • Record Packets
  • The Memory Allocator
  • The Task Scheduler

 

Record Packets

 

In part 1 of this series we talked about the overhead of making an application multi-threaded, and how, if we did that on a per-row basis, the cost of multi-threading would outweigh the benefits: although we could use more CPU power, we would ultimately be slower. The way we tackled this issue in AMP is that tools no longer process data on a record-by-record basis.

Instead, they process data in record packets.

A record packet is a fixed-size allocation of memory (today 4 MB) which contains a number of records, so if a record is 1 KB, a packet can hold about 4,000 records. Tools multi-thread work on a per-packet basis, which means the cost of multi-threading is spread over all the records in that packet and no longer has a detrimental effect on overall runtimes.
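To make the packet arithmetic concrete, here is a minimal C++ sketch of a fixed-size packet; RecordPacket and kPacketBytes are illustrative names of my own, not the actual AMP types:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative constant only; AMP's real packet layout is internal.
constexpr std::size_t kPacketBytes = 4 * 1024 * 1024;  // 4 MB packet

class RecordPacket {
public:
    // Append one record's bytes; returns false once the packet is full,
    // signalling that the caller should start a new packet.
    bool Append(const char* record, std::size_t recordBytes) {
        if (used_ + recordBytes > kPacketBytes) return false;
        std::memcpy(buffer_.data() + used_, record, recordBytes);
        used_ += recordBytes;
        ++count_;
        return true;
    }
    std::size_t RecordCount() const { return count_; }

private:
    std::vector<char> buffer_ = std::vector<char>(kPacketBytes);
    std::size_t used_ = 0;   // bytes written so far
    std::size_t count_ = 0;  // records held
};

// With 1 KB records, kPacketBytes / 1024 ≈ 4,096 records fit per packet,
// so the cost of handing a packet to another thread is amortized over
// thousands of rows rather than paid once per row.
```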

A record packet is always fixed in size, and we try to keep packets relatively full to help with performance (packets that are only marginally filled make for very inefficient use of memory). Larger fields (think big spatial or string fields) are held outside the packet. We will cover these in a future post; for now, just know that you don't need to worry about large fields taking up all the space in a packet.

 

Memory Allocator

 

Next, to deal with all these record packet memory allocations, we have a new component called the Memory Allocator. The allocator's job is to "allocate" memory for record packets and other large data fields, and to manage the storage of that memory.

The allocator ensures that if you have more data than the memory limit you have set for AMP, it will first compress record packets and then write them to disk, keeping memory usage under the set limit.

This means that an individual tool does not need to worry about where that memory is stored: it receives a handle to a memory packet, and when it wants to read from or write to that packet, the memory allocator ensures the data is ready and available in RAM.
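A minimal sketch of what such a handle-based, spilling allocator can look like in C++. All names here are illustrative, not the actual AMP implementation; in particular, the real allocator compresses packets before spilling them and pin-counts packets so one being read is never evicted, both of which this toy version skips:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <vector>

class MemoryAllocator {
public:
    explicit MemoryAllocator(std::size_t budgetBytes) : budget_(budgetBytes) {}

    // Reserve a new packet, spilling older ones if we are over budget.
    std::uint64_t Allocate(std::size_t bytes) {
        EnsureRoom(bytes);
        std::uint64_t h = next_++;
        inRam_[h] = std::vector<char>(bytes);
        used_ += bytes;
        return h;  // tools hold this opaque handle, never a raw pointer
    }

    // Make sure the packet behind a handle is in RAM and return it.
    std::vector<char>& Pin(std::uint64_t h) {
        auto it = inRam_.find(h);
        if (it != inRam_.end()) return it->second;
        std::vector<char> data = ReadSpill(h);  // was spilled: reload it
        EnsureRoom(data.size());
        used_ += data.size();
        return inRam_[h] = std::move(data);
    }

private:
    // Write in-RAM packets to disk until we fit under the budget.
    void EnsureRoom(std::size_t bytes) {
        while (used_ + bytes > budget_ && !inRam_.empty()) {
            auto victim = inRam_.begin();  // naive eviction choice
            WriteSpill(victim->first, victim->second);
            used_ -= victim->second.size();
            inRam_.erase(victim);
        }
    }
    void WriteSpill(std::uint64_t h, const std::vector<char>& d) {
        std::FILE* f = std::fopen(Path(h).c_str(), "wb");
        std::fwrite(d.data(), 1, d.size(), f);
        std::fclose(f);
    }
    std::vector<char> ReadSpill(std::uint64_t h) {
        std::FILE* f = std::fopen(Path(h).c_str(), "rb");
        std::fseek(f, 0, SEEK_END);
        std::vector<char> d(static_cast<std::size_t>(std::ftell(f)));
        std::fseek(f, 0, SEEK_SET);
        std::fread(d.data(), 1, d.size(), f);
        std::fclose(f);
        return d;
    }
    std::string Path(std::uint64_t h) {
        return "packet_" + std::to_string(h) + ".spill";
    }

    std::size_t budget_;       // upper bound set for AMP's memory use
    std::size_t used_ = 0;
    std::uint64_t next_ = 0;
    std::unordered_map<std::uint64_t, std::vector<char>> inRam_;
};
```

The key design point is that the handle is an indirection: because no tool ever holds a raw pointer across calls, the allocator is free to compress or spill any unpinned packet whenever the budget is exceeded.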

 

Task Scheduler

 

The e1 engine would push records between tools on the main thread of the application; this aspect of the e1 architecture is ultimately what prevents it from effectively using all of the cores on your machine. There are background threads in individual tools which do other work, but importantly, only one record can be moving between tools at any given time.

 

[Image: AdamR_0-1596015708102.png]

 

In AMP, the actual work of a workflow is coordinated by a new component called the Task Scheduler. Tools produce "tasks" of work, typically on a single packet of data, and the scheduler pulls tasks out of a queue and efficiently schedules them across however many cores the user has configured for AMP. This is the heart of why AMP can make use of so many cores: a given tool can have multiple tasks all running together in parallel, as the sketch after the diagram below illustrates.

 

[Image: AdamR_1-1596015708102.png]
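A minimal C++ sketch of such a task scheduler, assuming a classic worker-pool design; TaskScheduler and Schedule are illustrative names, not the actual AMP API:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Tools push per-packet tasks into a queue; a pool of workers
// (one per core configured for AMP) pulls and runs them in parallel.
class TaskScheduler {
public:
    explicit TaskScheduler(unsigned cores) {
        for (unsigned i = 0; i < cores; ++i)
            workers_.emplace_back([this] { WorkerLoop(); });
    }
    ~TaskScheduler() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // A tool schedules one unit of work, typically "process this packet".
    void Schedule(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (done_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // e.g. run one tool's work over one record packet
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};
```

Because each scheduled task is independent, a single tool can have one task per packet in flight at once, which is how one tool ends up running on many cores simultaneously.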

 

Having introduced some foundational concepts of AMP, in the next post in this series we will take a look at how we perform summarize and join operations in a massively parallel way.

Adam Riley

Former account of @AdamR. Find me at https://community.alteryx.com/t5/user/viewprofilepage/user-id/120 and https://www.linkedin.com/in/adriley/
