Alteryx Designer Desktop Discussions

AMP JOIN never completes, non-AMP JOIN finishes in under a minute

I'm running Alteryx Designer 2021.2.1.35394.

 

Unfortunately, I cannot supply samples because the datasets are around 1 GB and contain proprietary information.

 

But I will explain the type of JOIN that is having problems. I'm working with data from network- and agent-based vulnerability scans, and I want to create an "index" field based on the system name or IP address. I have a NetBIOS Name, a DNS Fully-Qualified Domain Name, and an IP address, but all three fields are not necessarily populated. Only the IP address is always populated, but that isn't a good name.

 

Here is some sample data, but I cannot reproduce the problem with a dataset this small. It only happens when I run it against hundreds of thousands of lines.

 

NetBIOS Name       | FQDN                     | IP Address    | Vulnerability Details
DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10   | Windows Vulnerability #1
DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10   | Windows Vulnerability #2
DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10   | Windows Vulnerability #3
DOMAIN\WINSERVER01 | winserver01.local.domain | 10.10.10.10   | Windows Vulnerability #4
DATABASE01         |                          | 10.11.10.10   | Database Vulnerability #1
DATABASE01         |                          | 10.11.10.10   | Database Vulnerability #2
DATABASE01         |                          | 10.11.10.10   | Database Vulnerability #3
NAS01              | cifs.local.domain        | 10.15.10.10   | NAS Vulnerability #1
NAS01              | cifs.local.domain        | 10.15.10.10   | NAS Vulnerability #2
NAS01              | cifs.local.domain        | 10.15.10.10   | NAS Vulnerability #3
                   | switch.dmz.domain        | 192.168.10.10 | Switch Vulnerability #1
                   | switch.dmz.domain        | 192.168.10.10 | Switch Vulnerability #2
                   | switch.dmz.domain        | 192.168.10.10 | Switch Vulnerability #3
                   |                          | 172.16.30.1   | Firewall Vulnerability #1
                   |                          | 172.16.30.1   | Firewall Vulnerability #2
                   |                          | 172.16.30.1   | Firewall Vulnerability #3

 

The real data also has a dozen more columns.

 

My workflow works like this:

  1. Select only the NetBIOS Name, FQDN, and IP address fields from the original data source
  2. Use the Unique Tool to remove all duplicates so that I am only doing this work a single time for each combination
  3. If there is an FQDN, split it into hostname (before the first period) and domain (everything after the hostname)
  4. If there is a NetBIOS domain as part of the NetBIOS Name, split it into hostname (after the backslash) and domain (before the backslash)
  5. Create a text-sortable IP address by zero-padding the IP address so that each octet is exactly three digits
  6. If there is an FQDN hostname, make it uppercase and set that as the Hostname Index field
  7. If there is not an FQDN hostname, but there is a NetBIOS hostname, make it uppercase and set that as the Hostname Index field
  8. If there is not an FQDN hostname and there is not a NetBIOS hostname, then use the sortable IP address as the Hostname Index field
  9. JOIN these new fields back with the original data to add in the Hostname Index, FQDN hostname, FQDN domain, NetBIOS hostname, NetBIOS domain, and sortable IP address
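
To make the parsing rules in steps 3 through 8 concrete, here is a minimal Python sketch of the same logic. It is not the Alteryx workflow itself; the function names and output keys are made up for illustration, and it assumes blank fields arrive as empty strings and that IP addresses are IPv4.

```python
def split_fqdn(fqdn):
    """Split an FQDN into (hostname, domain) at the first period."""
    if not fqdn:
        return "", ""
    host, _, domain = fqdn.partition(".")
    return host, domain


def split_netbios(name):
    """Split a NetBIOS name into (hostname, domain); the domain prefix is optional."""
    if not name:
        return "", ""
    if "\\" in name:
        domain, _, host = name.partition("\\")
        return host, domain
    return name, ""


def sortable_ip(ip):
    """Zero-pad each IPv4 octet to three digits so a text sort matches numeric order."""
    return ".".join(octet.zfill(3) for octet in ip.split("."))


def hostname_index(netbios, fqdn, ip):
    """Apply the fallback order: FQDN hostname, then NetBIOS hostname, then padded IP."""
    fqdn_host, fqdn_domain = split_fqdn(fqdn)
    nb_host, nb_domain = split_netbios(netbios)
    padded = sortable_ip(ip)
    return {
        "hostname_index": fqdn_host.upper() or nb_host.upper() or padded,
        "fqdn_hostname": fqdn_host,
        "fqdn_domain": fqdn_domain,
        "netbios_hostname": nb_host,
        "netbios_domain": nb_domain,
        "sortable_ip": padded,
    }


# Rows taken from the sample data above
print(hostname_index("DOMAIN\\WINSERVER01", "winserver01.local.domain", "10.10.10.10"))
print(hostname_index("", "", "172.16.30.1"))
```

With the sample rows, the first call yields a Hostname Index of WINSERVER01 and the firewall row falls through to 172.016.030.001.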

If I use the original Alteryx engine it completes in under a minute. If I use AMP Engine, I've let it run for an hour and it still didn't finish.

The JOIN itself stays at 50% complete.
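
For anyone trying to picture the join in step 9: it is a many-to-one lookup, where a small set of unique (NetBIOS Name, FQDN, IP address) combinations is matched back onto every original row. The Python sketch below shows only that logic. It says nothing about how either Alteryx engine implements the Join tool and is not a fix for the hang; `join_back` and `placeholder_derive` are hypothetical names, and blank fields are assumed to be empty strings so that they match each other.

```python
def join_back(rows, key_fields, derive):
    """Derive fields once per unique key, then attach them to every row."""
    lookup = {}  # unique key -> derived fields (the Unique + Formula branch)
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in lookup:
            lookup[key] = derive(*key)
    # The join itself: every original row finds exactly one match in the lookup.
    joined = [{**row, **lookup[tuple(row[f] for f in key_fields)]} for row in rows]
    return joined, len(lookup)


def placeholder_derive(netbios, fqdn, ip):
    # Hypothetical stand-in for the parsing steps; only the fallback order matters here.
    name = fqdn.split(".")[0] or netbios.split("\\")[-1]
    return {"Hostname Index": name.upper() or ip}


rows = [
    {"NetBIOS Name": "DATABASE01", "FQDN": "", "IP Address": "10.11.10.10",
     "Vulnerability Details": "Database Vulnerability #1"},
    {"NetBIOS Name": "DATABASE01", "FQDN": "", "IP Address": "10.11.10.10",
     "Vulnerability Details": "Database Vulnerability #2"},
    {"NetBIOS Name": "", "FQDN": "", "IP Address": "172.16.30.1",
     "Vulnerability Details": "Firewall Vulnerability #1"},
    {"NetBIOS Name": "", "FQDN": "", "IP Address": "172.16.30.1",
     "Vulnerability Details": "Firewall Vulnerability #2"},
]

joined, unique_count = join_back(rows, ("NetBIOS Name", "FQDN", "IP Address"), placeholder_derive)
print(unique_count, len(joined))
```

Every input row comes back out with the derived fields attached, so the joined row count equals the input row count and nothing is left un-joined on either side.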

I've attached a test, but it works fine at this scale.

 

Here is a set of logs using the old Alteryx engine with some identifying information scrubbed.

 

Start: Designer x64: Started running C:\Users\Jasey\Documents\Alteryx\Workflows\Bucket\Test Hostname and Domain Name Parse.yxmd at 06/04/2021 14:11:00
Info: Designer x64: The Designer x64 reported: Allocating requested memory would be more than available physical memory. Reverting to 3240.0 MB of memory.
File_Input: Input Data (1): C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb|3344930 records were read from "C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb"
Info: Unique (15): 16168 unique records and 3328762 duplicates were found
Info: Join (2): 3344930 records were joined with 0 un-joined left records and 0 un-joined right records
Info: Input Data (1): Profile Time: 29,812.03ms, 48.97%
Info: Join (2): Profile Time: 27,010.62ms, 44.37%
Info: Unique (15): Profile Time: 3,187.12ms, 5.24%
Info: Select (14): Profile Time: 730.89ms, 1.20%
Info: Formula (8): Profile Time: 48.00ms, 0.08%
Info: Formula (12): Profile Time: 32.72ms, 0.05%
Info: Formula (13): Profile Time: 15.27ms, 0.03%
Info: Text To Columns (11): Profile Time: 14.54ms, 0.02%
Info: Text To Columns (10): Profile Time: 11.61ms, 0.02%
Info: Select (9): Profile Time: 7.48ms, 0.01%
Info: Select (7): Profile Time: 5.80ms, 0.01%
End: Designer x64: Finished running Test Hostname and Domain Name Parse.yxmd in 1:01 minutes

 

 

For AMP Engine, I canceled it after 15 minutes.

 

Start: Designer x64: Started running C:\Users\Jasey\Documents\Alteryx\Workflows\Bucket\Test Hostname and Domain Name Parse.yxmd at 06/04/2021 14:14:49
Info: Designer x64: The Designer x64 reported: This is AMP Engine; running 8 worker threads; memory limit 4061.0 MB.
Info: Designer x64: The Designer x64 reported: Beginning to compact waiting packets to reduce memory usage
File_Input: Input Data (1): C:\Users\Jasey\Documents\Alteryx\Workflows\Database.yxdb|3344930 records were read from "C:\Users\70067\Documents\Alteryx\Workflows\Database.yxdb"
Info: Unique (15): 16168 unique records and 3328762 duplicates were found
Error: Designer x64: User Canceled

 

The JOIN just stayed at 50% for nearly the entire time.

1 REPLY
AkimasaKajitani
17 - Castor

Hi @Jasey_DePriest 

 

I think this problem is related to the AMP Engine.

It seems that if there is not enough memory, the AMP Engine may deadlock.

Some similar problems were fixed in version 2021.2 (please see the release notes).

So I recommend contacting Alteryx Support.
