Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Post Brexit UK VAT ID Validation

mahadevaswab
8 - Asteroid

Dear All,

 

Do we have any workflows to validate the VAT ID pertains to UK using below website.

 

https://www.tax.service.gov.uk/check-vat-number/known

 

as this was previously used to validate the status of VAT ID in VIES web portal.

 

Regards

Mahadeva Swamy

22 REPLIES 22
BrandonB
Alteryx
Alteryx

An API based approach is always preferable, but I was able to leverage Alteryx + Python + Selenium to remote control my web browser, load the value, hit enter, and scrape the resulting page. I took my inspiration from this page: https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Python-Code-Tool-Web-Scraping-Dynamic-... 

 

Here is my Python script that I used for the process:

 

# Import necessary requirements and install if does not exist
from ayx import Package
Package.installPackages(['pandas','numpy','selenium', 'urllib3'])
from ayx import Alteryx
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from time import sleep
import pandas

#Read in VAT Number
VATdf = Alteryx.read("#1")
VAT = str(VATdf.iloc[0, 0])

# Start the WebDriver and load the page
wd = webdriver.Chrome("C:/Program Files/Alteryx/bin/Plugins/chromedriver")
wd.get("https://www.tax.service.gov.uk/check-vat-number/enter-vat-details")

# VAT variable to be sent in to target element
#VAT = "GB243510593"

#Identify target element, send VAT, and hit return
elem = wd.find_element_by_id("target")
elem.send_keys(VAT)
elem.send_keys(Keys.RETURN)

# let page load for a few seconds before scraping
time.sleep(5)

# And grab the page whole HTML source
html_page = wd.page_source

# Attempt to close chromium instance
wd.quit()

# Turn the variabe with html page into Pandas' DF
df = pandas.DataFrame({"html_page":[html_page]})

# Write the data frame to Alteryx workflow for downstream processing
Alteryx.write(df,1)

 

Dynamic VAT Lookup.png

 

You can either turn this workflow into a macro where you add a control parameter and action tool to the text input and a macro output at the end, or you could modify the python scrip to loop through every value to output a row to then be parsed. Hopefully this is a helpful start! I have attached my workflow to this post. 

mahadevaswab
8 - Asteroid

Dear @BrandonB ,

 

thanks for the time and effort on this.

However, when I check in my C drive I did not fine "chromedriver", can you please advise where do I download the same in order to run the scrip given in the workflow. 

BrandonB
Alteryx
Alteryx

Correct, that’s because it is an additional download found in the first link. You will need the correct chrome driver for whatever version of chrome that you have. http://chromedriver.chromium.org/downloads

You will want to put the chromedriver.exe file in the C drive location mentioned in the script so that it can be referenced. 

mahadevaswab
8 - Asteroid

I am getting the below error when I run the above attached Workflow

 

 

Error: Python (2): ---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-1-1634d9cc6e05> in <module>
1 # Import necessary requirements and install if does not exist
2 from ayx import Package
----> 3 Package.installPackages(['pandas','numpy','selenium', 'urllib3'])
4 from ayx import Alteryx
5 from selenium import webdriver
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\Package.py in installPackages(package, install_type, debug)
200 print(pip_install_result["msg"])
201 if not pip_install_result["success"]:
--> 202 raise pip_install_result["err"]
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\Utils.py in runSubprocess(args_list, debug)
118
119 try:
--> 120 result = subprocess.check_output(args_list, stderr=subprocess.STDOUT)
121 if debug:
122 print("[Subprocess success!]")
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in check_output(timeout, *popenargs, **kwargs)
409 kwargs['input'] = '' if kwargs.get('universal_newlines', False) else b''
410
--> 411 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
412 **kwargs).stdout
413
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
510 retcode = process.poll()
511 if check and retcode:
--> 512 raise CalledProcessError(retcode, process.args,
513 output=stdout, stderr=stderr)
514 return CompletedProcess(process.args, retcode, stdout, stderr)
CalledProcessError: Command '['c:\\program files\\alteryx\\bin\\miniconda3\\envs\\jupytertool_venv\\python.exe', '-I', '-m', 'pip', 'install', 'pandas', 'numpy', 'selenium', 'urllib3']' returned non-zero exit status 1.

 

 

BrandonB
Alteryx
Alteryx

This Python script uses additional packages so Alteryx must be launched as administrator the first time and then run this workflow so that it can install the required packages. Then the workflow will work as expected

mahadevaswab
8 - Asteroid

@BrandonB thanks for the help, however what do you mean by run as administrator, since this License is procured and given by our company and we are also using the same as administrator only, is there any specific changes required while launching this. please help on this.

 

Regards

Mahadev

BrandonB
Alteryx
Alteryx

You will want to hold down the shift button and right click the Alteryx Designer icon and then click “Run as administrator”

mahadevaswab
8 - Asteroid

Hi @BrandonB , I did as advised but still getting the below error.

 

Error: Python (2): ---------------------------------------------------------------------------
PermissionError Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\selenium\webdriver\common\service.py in start(self)
71 cmd.extend(self.command_line_args())
---> 72 self.process = subprocess.Popen(cmd, env=self.env,
73 close_fds=platform.system() != 'Windows',
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
853
--> 854 self._execute_child(args, executable, preexec_fn, close_fds,
855 pass_fds, cwd, env,
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
1306 try:
-> 1307 hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
1308 # no special security
PermissionError: [WinError 5] Access is denied

During handling of the above exception, another exception occurred:
WebDriverException Traceback (most recent call last)
<ipython-input-2-07b44840375d> in <module>
1 # Start the WebDriver and load the page
----> 2 wd = webdriver.Chrome("C:/Program Files/Alteryx/bin/Plugins/chromedriver")
3 wd.get("https://www.tax.service.gov.uk/check-vat-number/enter-vat-details")
4
5 # VAT variable to be sent in to target element
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\selenium\webdriver\chrome\webdriver.py in __init__(self, executable_path, port, options, service_args, desired_capabilities, service_log_path, chrome_options, keep_alive)
71 service_args=service_args,
72 log_path=service_log_path)
---> 73 self.service.start()
74
75 try:
c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\selenium\webdriver\common\service.py in start(self)
84 )
85 elif err.errno == errno.EACCES:
---> 86 raise WebDriverException(
87 "'%s' executable may have wrong permissions. %s" % (
88 os.path.basename(self.path), self.start_error_message)
WebDriverException: Message: 'chromedriver' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home

 

Kingazawora
8 - Asteroid

Hi All

Do you have maybe the workflow in Alteryx?

Thanks, Kinga

Labels
Top Solution Authors