NLP with R and Python

Daniel Bendel
2 min read · Feb 12, 2021

Using the NLP power of sentence_transformers and transformers from R.

During my work at eyeo, I had to perform natural language processing (NLP).

While I am familiar with R, using the Hugging Face models with Python was quite a challenge.

Therefore, I created a GitLab repository that shows how to do NLP with R using Python in the background. I hope you find it helpful :)

The idea is to make state-of-the-art neural network models for natural language processing available to those who are used to working with R.

In principle, you send your text data from R to Python. Python then runs the (automatically downloaded) NLP models to perform sentiment analysis and semantic similarity analysis.
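To make that round trip concrete, here is a minimal sketch of what the Python side could look like. It is not the code from the repository: the function name `analyze_texts` and the two callables `classify` and `embed` are hypothetical stand-ins for the downloaded models, so the sketch runs without any model installed.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analyze_texts(texts, classify, embed):
    """Hypothetical entry point that R would call with a character vector.

    classify: callable mapping a list of strings to a list of labels.
    embed:    callable mapping a list of strings to a list of vectors.
    Both are placeholders for the real NLP models.
    """
    # A length-1 character vector from R can arrive as a plain string.
    if isinstance(texts, str):
        texts = [texts]
    texts = list(texts)

    vectors = [list(v) for v in embed(texts)]
    # Pairwise similarity matrix, one row and one column per input text.
    similarity = [[cosine(u, v) for v in vectors] for u in vectors]

    return {
        "text": texts,
        "sentiment": list(classify(texts)),
        "similarity": similarity,
    }
```

One way to call this from R would be the reticulate package, for example by loading the file with `reticulate::source_python()`. That is an assumption on my side; the repository may wire things up differently. With reticulate, the returned dictionary arrives in R as a named list.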

The results are then sent back to R, where you can perform data analysis such as clustering similar texts, ranking the given sentences by their centrality (with Google’s PageRank algorithm), or anything else you can think of.
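To illustrate the centrality ranking, here is a small sketch of PageRank computed by power iteration on a sentence similarity matrix. It is written in plain Python to keep it self-contained; in R you would more likely use a graph package. The function names, the damping factor of 0.85 and the choice to ignore self-similarity and negative similarities are my assumptions, not something taken from the repository.

```python
def pagerank(similarity, damping=0.85, iterations=100, tol=1e-9):
    """PageRank scores for a square similarity matrix (list of lists)."""
    n = len(similarity)
    if n == 0:
        return []

    # Use similarities as edge weights; drop self-links and negative values.
    weights = [
        [max(similarity[i][j], 0.0) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    row_sums = [sum(row) for row in weights]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = 0.0
            for i in range(n):
                if row_sums[i] > 0:
                    incoming += scores[i] * weights[i][j] / row_sums[i]
                else:
                    # A sentence with no links spreads its score evenly.
                    incoming += scores[i] / n
            new_scores.append((1.0 - damping) / n + damping * incoming)

        change = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if change < tol:
            break
    return scores


def rank_sentences(sentences, similarity):
    """Return (sentence, score) pairs, most central sentence first."""
    scores = pagerank(similarity)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]
```

A sentence that is similar to many other sentences collects score from all of them and ends up at the top, which is why this kind of ranking is often used to pick representative sentences.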

For sentiment analysis, I suggest the multilingual uncased BERT model from nlptown. For semantic similarity analysis, I recommend the large RoBERTa model from Facebook as well as Google’s large BERT model.
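As a rough sketch, loading such models in Python could look like the following. The exact checkpoint names are my assumption: `nlptown/bert-base-multilingual-uncased-sentiment` is the nlptown sentiment model on the Hugging Face hub, and `sentence-transformers/roberta-large-nli-stsb-mean-tokens` is one RoBERTa-large sentence embedding model that fits the description above, but it may not be the checkpoint used in the repository. The helper names are hypothetical as well.

```python
# Assumed checkpoint names; swap in whichever models you prefer.
SENTIMENT_MODEL = "nlptown/bert-base-multilingual-uncased-sentiment"
SIMILARITY_MODEL = "sentence-transformers/roberta-large-nli-stsb-mean-tokens"


def stars_from_label(label):
    """Turn a label such as '4 stars' into the integer 4."""
    return int(label.split()[0])


def sentiment_stars(texts):
    """Rate each text from 1 to 5 stars with the sentiment model."""
    # Imported here so the module loads even without transformers installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=SENTIMENT_MODEL)
    return [stars_from_label(result["label"]) for result in classifier(list(texts))]


def sentence_embeddings(texts, model_name=SIMILARITY_MODEL):
    """Embed each text as a vector for semantic similarity analysis."""
    # Imported here so the module loads even without sentence_transformers.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # encode() returns a NumPy array; plain lists are easier to hand to R.
    return model.encode(list(texts)).tolist()
```

Both models are downloaded automatically on first use, which can take a while for the large checkpoints. In a real script you would load each model once and reuse it instead of reloading it on every call.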

Feel free to push your analysis into the repo so I can share a collection of the code’s applications with everyone.

The repository also contains a step-by-step guide on how to install everything you need (such as Anaconda and the required Python packages) to make your start easier.

Daniel Bendel
Data Scientist - mainly with R and some Python