How to identify languages in Alteryx
I'm working with a dataset that includes comments in multiple languages. To route the translation work correctly, we need to identify the language each comment is written in.
Is there a way to do this using, for example, the character sets specific to particular languages (e.g. Japanese or Korean), or characteristic words for other languages (e.g. Danish, Spanish)?
Or is there maybe an API that can be called to check on the language?
Thanks.
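For reference, the character-set and word-list idea from the question can be sketched roughly as follows. This is a minimal illustration, not a tested production approach: the Unicode ranges cover only the main kana, Hangul, and Han blocks, and the `STOPWORDS` lists and the `guess_language` name are hypothetical and far too small for real data. Note that Han characters alone cannot separate Chinese from Japanese.

```python
import re

# Main Unicode blocks only; rarer extension blocks are ignored in this sketch.
KANA = re.compile(r"[\u3040-\u30ff]")                 # Hiragana and Katakana
HANGUL = re.compile(r"[\u1100-\u11ff\uac00-\ud7af]")  # Hangul Jamo and Syllables
HAN = re.compile(r"[\u4e00-\u9fff]")                  # CJK Unified Ideographs

# Hypothetical, deliberately tiny word lists; a real list would be much longer.
STOPWORDS = {
    "da": {"og", "det", "er", "ikke", "jeg", "til"},
    "es": {"el", "la", "que", "es", "los", "y", "pero"},
}


def guess_language(text):
    """Return a rough language code, or None when nothing matches."""
    if KANA.search(text):
        return "ja"
    if HANGUL.search(text):
        return "ko"
    if HAN.search(text):
        return "zh-or-ja"  # Han characters are shared between the two
    # Fall back to counting characteristic words for Latin-script languages.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Short comments and mixed-language comments will often come back as None or be misclassified, which is why a dedicated detection service or library tends to be the safer choice.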
- Labels:
- Datasets
- Text Mining
https://detectlanguage.com has an API that looks like it does what you want.
I built a quick macro that uses it. You will need to sign up and get an API key to use it, but they have a free tier you could start with.
https://www.linkedin.com/in/adriley/
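Outside of the macro, a call to the service could look roughly like the sketch below. The endpoint URL, the `q` parameter, the Bearer-token header, and the `data.detections` response shape are assumptions based on the service's public documentation at the time of writing, so check them against the current docs before relying on this. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the current detectlanguage.com documentation.
API_URL = "https://ws.detectlanguage.com/0.2/detect"


def build_request(text, api_key):
    """Build a POST request carrying the text and the API key."""
    data = urllib.parse.urlencode({"q": text}).encode("utf-8")
    headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(API_URL, data=data, headers=headers)


def best_language(payload):
    """Pick the highest-confidence language from a parsed response.

    Assumes a shape like {"data": {"detections": [{"language": ..., "confidence": ...}]}}.
    """
    detections = payload.get("data", {}).get("detections", [])
    if not detections:
        return None
    top = max(detections, key=lambda d: d.get("confidence", 0))
    return top.get("language")


def detect_language(text, api_key, timeout=10):
    """Send one comment to the service and return its language code, or None."""
    request = build_request(text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return best_language(json.load(response))
```

Calling `detect_language("Hola, ¿cómo estás?", "YOUR_API_KEY")` would send one request per comment, so the free tier's request limit is worth checking before running it over a large dataset.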
Thanks Adam, I'll give that a try.
Hey @AlexCUK
There's also a Python library:
https://pypi.org/project/langdetect/
I've used it before. If you're comfortable with the Python tool, this may be a suitable solution.
Cheers,
TheOC
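A minimal sketch of how langdetect might be applied to a list of comments, assuming the package is installed in the Python environment the Python tool uses. The helper names `make_langdetect_detector` and `label_comments` are hypothetical, and reading from and writing to the Alteryx workflow is left out.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def make_langdetect_detector() -> Callable[[str], Optional[str]]:
    """Return a function that maps text to a language code using langdetect."""
    # Imported lazily so the rest of the code loads without the package.
    from langdetect import DetectorFactory, detect
    from langdetect.lang_detect_exception import LangDetectException

    # langdetect is non-deterministic by default; a fixed seed makes runs repeatable.
    DetectorFactory.seed = 0

    def _detect(text: str) -> Optional[str]:
        try:
            return detect(text)
        except LangDetectException:
            # Raised when the text has no usable features (e.g. digits only).
            return None

    return _detect


def label_comments(
    comments: Iterable[Optional[str]],
    detector: Callable[[str], Optional[str]],
) -> List[Tuple[Optional[str], Optional[str]]]:
    """Pair each comment with its detected language, skipping blanks."""
    results = []
    for comment in comments:
        text = (comment or "").strip()
        results.append((comment, detector(text) if text else None))
    return results
```

Usage would be along the lines of `label_comments(comments, make_langdetect_detector())`. Detection on very short comments is unreliable, so it may be worth reviewing anything under a few words by hand.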
Thanks, I'll take a look at the Python solution too.