The Python SDK can automatically install Python packages through pip when a tool is installed, using the requirements.txt. Some changes to the tool's virtual environment are not covered by this, however: for example, downloading and configuring language models for spaCy cannot be done through the requirements.txt alone (the same goes for training corpora for NLTK).
So, as an idea for future versions of the SDK: allow us to specify a Python script that is run when a tool is installed. This way we could set up the environment, load additional tools, etc.
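To make the idea concrete, here is a rough sketch of what such a post-install script might contain. It assumes the SDK would run the script with the tool's virtual-environment interpreter; no such hook exists today, and the model and corpus names are only placeholders. The two commands it builds (`python -m spacy download` and `python -m nltk.downloader`) are the usual command-line ways to fetch these resources.

```python
import subprocess
import sys

# Placeholder resources; replace with whatever the tool actually needs.
SPACY_MODELS = ["en_core_web_sm"]
NLTK_PACKAGES = ["stopwords", "punkt"]


def build_commands(python=sys.executable):
    """Return the setup commands to run with the given interpreter."""
    commands = []
    for model in SPACY_MODELS:
        commands.append([python, "-m", "spacy", "download", model])
    for package in NLTK_PACKAGES:
        commands.append([python, "-m", "nltk.downloader", package])
    return commands


def run_setup(runner=subprocess.check_call):
    """Run every setup command in order; check_call raises on the first failure."""
    for command in build_commands():
        print("Running:", " ".join(command))
        runner(command)


# A real post-install script would end with a call to run_setup().
```

Using `sys.executable` keeps the downloads inside whichever environment runs the script, which is the point of tying it to the tool's installation.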
I think this is a good idea; I'd suggest you add it here.
I don't have personal experience with spaCy (although I know @JPKa does), but I believe NLTK checks to see if something has already been downloaded before starting the download. So while not as efficient as your proposal, a reasonable workaround would be to download the training corpora in the plugin script itself. It would take longer to run the workflow the first time the tool is ever used, then be much faster thereafter.
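A minimal sketch of that workaround, assuming the check runs at the start of the plugin script: `ensure_resource` and `ensure_nltk_corpus` are hypothetical helper names, while `nltk.data.find` and `nltk.download` are the standard NLTK calls. The log message is there so the user can see why the first run is slow.

```python
def ensure_resource(is_available, fetch, log=print):
    """Fetch a resource only if it is missing.

    Returns True if a download was triggered, False if it was already there.
    """
    if is_available():
        return False
    log("Downloading required language data; this only happens on the first run...")
    fetch()
    return True


def ensure_nltk_corpus(name, log=print):
    """Download an NLTK corpus (e.g. "stopwords") unless it is already installed."""
    import nltk  # imported lazily so ensure_resource works without NLTK

    def is_available():
        try:
            # Raises LookupError when the corpus is not in any NLTK data path.
            nltk.data.find("corpora/" + name)
            return True
        except LookupError:
            return False

    return ensure_resource(is_available, lambda: nltk.download(name, quiet=True), log)
```

In the plugin, something like `ensure_nltk_corpus("stopwords")` would go before the first use of the corpus. In a real tool, `log` would presumably be whatever function the plugin uses to send messages to the workflow's results window, rather than `print`.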
Using the first run of the plugin is definitely an option. But users might think something is wrong if download and installation take 5 minutes or so and they don't know what's happening. Currently, I simply deploy the language models along with the plugin in the yxi. This makes the installer quite large, but at least the model is available directly.
Moreover, this would also allow installing packages that are not available through `pip`.