I'm running Alteryx Designer 2021.2.1.35394.
Unfortunately, I cannot supply samples because the datasets are around 1 GB and contain proprietary information.
But I will explain the type of JOIN that is having problems. I'm working with data from network-based and agent-based vulnerability scans, and I want to create an "index" field based on the system name or IP address. I have a NetBIOS name, a DNS fully qualified domain name (FQDN), and an IP address, but not all three fields are necessarily populated. Only the IP address is always populated, but it isn't a good name.
Here is some sample data, but I cannot reproduce the problem with a dataset this small. It's only when I'm running it against hundreds of thousands of lines.
| NetBIOS Name | FQDN | IP Address | Vulnerability Details |
|---|---|---|---|
| DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10 | Windows Vulnerability #1 |
| DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10 | Windows Vulnerability #2 |
| DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10 | Windows Vulnerability #3 |
| DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10 | Windows Vulnerability #4 |
| DATABASE01 | | 10.11.10.10 | Database Vulnerability #1 |
| DATABASE01 | | 10.11.10.10 | Database Vulnerability #2 |
| DATABASE01 | | 10.11.10.10 | Database Vulnerability #3 |
| NAS01 | cifs.local.domain | 10.15.10.10 | NAS Vulnerability #1 |
| NAS01 | cifs.local.domain | 10.15.10.10 | NAS Vulnerability #2 |
| NAS01 | cifs.local.domain | 10.15.10.10 | NAS Vulnerability #3 |
| | switch.dmz.domain | 192.168.10.10 | Switch Vulnerability #1 |
| | switch.dmz.domain | 192.168.10.10 | Switch Vulnerability #2 |
| | switch.dmz.domain | 192.168.10.10 | Switch Vulnerability #3 |
| | | 172.16.30.1 | Firewall Vulnerability #1 |
| | | 172.16.30.1 | Firewall Vulnerability #2 |
| | | 172.16.30.1 | Firewall Vulnerability #3 |
The real data also has a dozen more columns.
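To make the intended "index" field concrete, here is a minimal Python sketch of the fallback logic, not the actual Alteryx workflow. The priority order (NetBIOS name, then FQDN, then IP address), the function name `host_index`, and the assumption that `DATABASE01` is a NetBIOS name and `switch.dmz.domain` is an FQDN are all illustrative guesses. It computes the index once per distinct combination of the three fields and then looks it up for each row, which is one way to avoid repeating the work on every duplicate line.

```python
from typing import Optional


def host_index(netbios: Optional[str], fqdn: Optional[str], ip: str) -> str:
    """Return the first populated identifier (assumed order: NetBIOS, FQDN, IP)."""
    for candidate in (netbios, fqdn):
        if candidate and candidate.strip():
            return candidate.strip()
    # The IP address is the only field that is always populated.
    return ip.strip()


# Rows modelled on the sample data: (NetBIOS Name, FQDN, IP Address).
rows = [
    ("DOMAIN\\WINSERVER01", "winserver01.local.domain", "10.10.10.10"),
    ("DOMAIN\\WINSERVER01", "winserver01.local.domain", "10.10.10.10"),
    ("DATABASE01", None, "10.11.10.10"),
    (None, "switch.dmz.domain", "192.168.10.10"),
    (None, None, "172.16.30.1"),
]

# Compute the index once per distinct triple, then look it up for every row.
index_by_triple = {triple: host_index(*triple) for triple in set(rows)}
indexes = [index_by_triple[row] for row in rows]
```

In this sketch the lookup dictionary stands in for a join back onto the full dataset, keyed on all three fields together.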
My workflow works like this:
If I use the original Alteryx engine it completes in under a minute. If I use AMP Engine, I've let it run for an hour and it still didn't finish.
The JOIN itself stays at 50% complete.
I've attached a test, but it works fine at this scale.
Here is a set of logs using the old Alteryx engine with some identifying information scrubbed.
Start: Designer x64: Started running C:\Users\Jasey\Documents\Alteryx\Workflows\Bucket\Test Hostname and Domain Name Parse.yxmd at 06/04/2021 14:11:00
Info: Designer x64: The Designer x64 reported: Allocating requested memory would be more than available physical memory. Reverting to 3240.0 MB of memory.
File_Input: Input Data (1): C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb|3344930 records were read from "C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb"
Info: Unique (15): 16168 unique records and 3328762 duplicates were found
Info: Join (2): 3344930 records were joined with 0 un-joined left records and 0 un-joined right records
Info: Input Data (1): Profile Time: 29,812.03ms, 48.97%
Info: Join (2): Profile Time: 27,010.62ms, 44.37%
Info: Unique (15): Profile Time: 3,187.12ms, 5.24%
Info: Select (14): Profile Time: 730.89ms, 1.20%
Info: Formula (8): Profile Time: 48.00ms, 0.08%
Info: Formula (12): Profile Time: 32.72ms, 0.05%
Info: Formula (13): Profile Time: 15.27ms, 0.03%
Info: Text To Columns (11): Profile Time: 14.54ms, 0.02%
Info: Text To Columns (10): Profile Time: 11.61ms, 0.02%
Info: Select (9): Profile Time: 7.48ms, 0.01%
Info: Select (7): Profile Time: 5.80ms, 0.01%
End: Designer x64: Finished running Test Hostname and Domain Name Parse.yxmd in 1:01 minutes
For AMP Engine, I canceled it after 15 minutes.
Start: Designer x64: Started running C:\Users\Jasey\Documents\Alteryx\Workflows\Bucket\Test Hostname and Domain Name Parse.yxmd at 06/04/2021 14:14:49
Info: Designer x64: The Designer x64 reported: This is AMP Engine; running 8 worker threads; memory limit 4061.0 MB.
Info: Designer x64: The Designer x64 reported: Beginning to compact waiting packets to reduce memory usage
File_Input: Input Data (1): C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb|3344930 records were read from "C:\Users\70067\Documents\Alteryx\Workflows\Database.yxdb"
Info: Unique (15): 16168 unique records and 3328762 duplicates were found
Error: Designer x64: User Canceled
The JOIN just stayed at 50% for nearly the entire time.
I think this problem is related to the AMP Engine.
It seems that if there is not enough memory, the AMP Engine may deadlock.
Some similar problems were fixed in version 2021.2 (please see the release notes).
So I recommend contacting Alteryx Support.