Confirm SpaCy installation and import PySBD project

Question

This is a continuation of a few posts involving parsing sentences (from a body of text).

The issue is identifying the whole sentence when abbreviations with periods may be present.

I am exploring a combination of lookup tables and dynamic approaches using Python plus a relevant library (e.g., NLTK or SpaCy).

Thanks to @danilang, I was able to test NLTK using the Python tool. Unfortunately, stock NLTK had difficulty identifying sentences.

So, I am off trying a newly suggested approach that, once again, is beyond my liberal arts skills. This time I want to try the Python Sentence Boundary Disambiguation (PySBD) project with SpaCy. It appears to be better suited than NLTK to handling edge cases such as abbreviations that contain periods (e.g., U.S., Mass., Co., plc.).

Two questions...

* I ran the following code in the Python tool to install SpaCy: Alteryx.installPackages("spacy"). It returned a Call Process Error exactly the same as the one in this post. I ran the tool again today and it states "Requirement already satisfied." So, how can I confirm SpaCy is actually installed correctly?
* How do you install the pySBD project from the Alteryx Python tool?  It does not appear to be included in the vanilla SpaCy installation.  I tried import pysbd, import pysbd.utils, from pysbd.utils import PySDBFactory, import pySBDFactory, etc.  They all generate a ModuleNotFoundError.

4Community_SpaCY_pySBD_entences v2.yxzp

hellyars · Answer

@JessieC   Thanks. I will have to try this out -- but it might take a bit, this isn't exactly my cup of tea.

JessieC · Answer

@hellyars - what version of AYX Designer do you have? See Help > About. Depending on the version, you can either check for the package in the file paths listed here - https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/How-To-Use-Alteryx-installPackages-in-Python-tool/ta-p/406244 or follow "Procedure: List the Currently Installed Modules"

%ALTERYX%\bin\Miniconda3\PythonTool_venv\Lib\site-packages until 2019.2

%ALTERYX%\bin\Miniconda3\envs\JupyterTool_vEnv\Lib\site-packages for 2019.3.1 to 2021.1.3

%ALTERYX%\bin\Miniconda3\envs\DesignerBaseTools_vEnv\Lib\site-packages for 2021.1.4+

You can also install Python packages from a command prompt (run as admin) - https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Install-Python-packages-via-command-prompt/ta-p/611877

For 2021.1.4+

Activate DesignerBaseTools_vEnv, then install with pip:

* cd "C:\Program Files\Alteryx\bin\Miniconda3\Scripts"
* activate DesignerBaseTools_vEnv
* cd "C:\Program Files\Alteryx\bin\Miniconda3\envs\DesignerBaseTools_vEnv\Scripts" 
* pip install pysbd
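
After the install finishes, one way to confirm that a package is really visible to the Python tool is to ask the interpreter directly instead of relying on the "Requirement already satisfied" message. The sketch below is only an illustration, not an official Alteryx procedure: installed_version is a hypothetical helper name, and it assumes Python 3.8 or later (needed for importlib.metadata).

```python
import importlib.util
from importlib import metadata  # available in Python 3.8+


def installed_version(dist_name, module_name=None):
    """Return the installed version string, or None if the module cannot be found.

    dist_name   -- name used with pip (e.g. "spacy", "pysbd")
    module_name -- name used with import; defaults to dist_name
    """
    module_name = module_name or dist_name
    # find_spec returns None when a top-level module is not on the import path.
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under this name.
        return "unknown"


for name in ("spacy", "pysbd"):
    print(name, "->", installed_version(name) or "NOT INSTALLED")
```

A result of NOT INSTALLED for pysbd would be consistent with the ModuleNotFoundError described in the question, and would suggest the pip step above has not yet reached the environment the Python tool uses.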

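To list the currently installed modules from inside the Python tool (the "Procedure: List the Currently Installed Modules" mentioned above):

from ayx import Alteryx
import io
import re
from contextlib import redirect_stdout
from pandas import DataFrame

# Capture the "pip freeze" listing that installPackages prints to stdout.
with io.StringIO() as current_output, redirect_stdout(current_output):
    Alteryx.installPackages(package='', install_type='freeze')
    freeze_output = current_output.getvalue()

# Keep only "package==version" lines and split each into its two fields.
packages = [
    out_row.strip().split("==", 1)
    for out_row in re.split(pattern=r"\r?\n", string=freeze_output)
    if "==" in out_row
]

output_df = DataFrame(packages, columns=["package", "version"])

# Send the package list to output anchor 1 of the Python tool.
Alteryx.write(output_df, 1)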
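
Once pysbd shows up in that list, a quick smoke test can confirm that the import works and show how it treats abbreviations like the ones in the question. This is a minimal sketch under a few assumptions: split_sentences is a hypothetical helper name, the sample text is made up for illustration, and it uses pysbd on its own through its Segmenter class rather than as a SpaCy pipeline component.

```python
def split_sentences(text):
    """Return a list of sentences, or None if pysbd cannot be imported."""
    try:
        import pysbd
    except ImportError:
        return None
    # clean=False keeps the original text unchanged apart from the splitting.
    segmenter = pysbd.Segmenter(language="en", clean=False)
    return segmenter.segment(text)


# Made-up sample containing abbreviations with periods.
sample = (
    "Acme Co. moved to the U.S. in 2019. "
    "It now has offices in Boston, Mass. and in London."
)

sentences = split_sentences(sample)
if sentences is None:
    print("pysbd is not installed in this environment")
else:
    for sentence in sentences:
        print(repr(sentence))
```

If the import fails here but pysbd appears in the package list, the install most likely went into a different environment than the one the Python tool runs in.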