Questions tagged [text-mining]

Refers to a subset of data mining concerned with extracting information from text data by recognizing patterns. The goal of text mining is often to classify a given document automatically into one of a number of categories, and to improve this performance dynamically, making it an example of machine learning. Email spam filters are one example of this type of text mining.

I'm wondering if there exist any models which could take in an ordered list of phrases without punctuation and generate a grammatically correct sentence from it.
For example, for the input: ["My dog"...

Let's say we have a free text containing key-value entities.
Example: "... patient's tumour has width 6 cm and height 5 cm"
Then an expert comes, marks it as important, thus we do have the rule for ...

I am working on a multi-class text classification problem with a hierarchical class structure: a super class and a sub class for every text example. What I am trying to do is: based on the text predict ...

I am looking for an algorithm that would be able to extract meaningful keyphrases from web articles. Each article has more than 2000 words and information is structured using paragraphs, h1, h2 tags ...

If you have some already preprocessed text that is tagged, what are the rules for extracting SVO triplets if you want a triple like (word, word, word)? Can you give a sentence as an example and extract all ...

Are there any articles about the best modern techniques in text classification (not only for English texts, but in general)? In particular, I'm interested in:
what kind of text preprocessing techniques are ...

I used TF-IDF for text similarity but the results were not very good. I then tried Google's Universal Sentence Encoder (TensorFlow Hub). The results were satisfactory but not up to the mark.
Is there ...

I am trying to generate a table with values parsed from unstructured text. Below are a couple of examples of possibly thousands of entries. For each entry, I would like to identify the title, assign ...

I am trying to extract an employee's skill set from his/her resume. I have resumes stored as plain text in a database. I do not have predefined skills in this case. How should I approach this problem? ...

I cannot really understand the logic behind HashingVectorizer for text feature extraction. I can follow the logic of Bag of Words or TF-IDF where the features are values for all/certain words/N-grams ...

I have a folder named iir that contains 500 txt files. I also have a JSON file named video (with a dictionary structure).
I wish to compute: for each of the 500 txt files, find the cosine similarity with ...

I am new to NLP and I would like to ask how I can extract sentences from a text based on keywords that I have, using Python. I created a list of keywords which will be used to extract sentences from ...

I have a list of around 700 variables which I need to perform a variable cleanup on. What complicates things is there are different numeric codes which flag an invalid value and these differ by the ...

I'm trying to prototype a system where given a textual query (e.g. a question), I get a list of most relevant documents/questions among a pool of available documents/questions (similar to what we see ...

I have a corpus of text files (free books) which are poorly formatted. The goal is to extract a particular chapter (say chapter 2) from the raw text with all weird formatting removed. Some documents ...

I am looking for references (papers/GitHub projects) on how to use deep learning in a text extraction task.
Recently I was given a task to extract important information from documents of similar type, ...

I have unstructured data consisting of three Excel sheets.
The text contents of the 3 tables are somehow related to each other, but there is no direct linear relationship between them, e.g. the first row of table 1 is not ...

I am new to the field of artificial intelligence and I am working on the classic example of spam detection using classification. I am using the Naive Bayes algorithm as well as SVM.
While working on them it ...

Does anybody have an idea/approach for doing text mining for a SWOT analysis (e.g. sentiment analysis)? I need to assign categories (Strengths, Weaknesses, Opportunities, Risks) to words in a document and ...

I have somewhat of a general/high level question.
Assume I'm doing supervised machine learning on some text data (tweets for example) and categorizing the documents to a certain taxonomy (multi-class ...

I'm working on an aspect-based sentiment analysis project applied to tweets or reviews. So far I've applied all the known preprocessing methods to my training dataset, which is an XML file that ...

I have 10k records of data; each record represents a unique product (10k class labels) and its description. For example, "Coffee Maker, this product takes coffee beans and brew it, to make tasty cofe". ...

I have extracted ~550 video scripts (subtitles) from 11 free courses on the Coursera platform. I have pre-processed them in terms of punctuation removal, stop words removal, tokenization, stemming and ...