
What is text mining?

Your business can gain valuable insights by letting AI analytics loose on your emails and documents

Analytics, the process of deriving information from raw data, is an important practice that businesses have tried to master for as long as such data has been available. In a short space of time, it's evolved from a fairly basic concept to an advanced practice incorporating technologies like machine learning and artificial intelligence (AI).

Text mining is one such evolution. It takes the basic idea of deriving information from data and applies it to vast volumes of documents, letters, emails and other written material. As with conventional analytics, the aim of text mining is to convert raw data into meaningful information that can then be used to support other processes.

Text mining has only really been possible thanks to the advent of AI and more specialist technology like natural language processing (NLP), given that to produce effective results you need to trawl through vast quantities of data at pace. If deployed correctly, text mining has the potential to open new insights for your organisation.

How does text mining work?

Before your organisation can take advantage of text mining, any text-based data needs to be structured – in other words, text mining is a secondary process. For example, data contained in streams of uncategorised documents is considered unstructured.

To give this kind of data structure, businesses often deploy relational databases, where the data is organised based on connections between stored items. Getting there involves a variety of processes, such as text parsing or pattern analysis, before the data can be considered structured. Once in this form, the data can be translated into something more visually appealing, such as charts, maps and tables.
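As a rough illustration of the parsing step, the sketch below turns raw email-style text into flat records that could be loaded into a database table. The sample messages, the field names and the `structure` helper are all hypothetical, and real pipelines would handle far messier input.

```python
import re
from collections import Counter

# Hypothetical raw messages: header lines, a blank line, then free text.
RAW_DOCS = [
    "From: alice@example.com\nDate: 2022-06-01\nSubject: Invoice overdue\n\n"
    "The invoice for May is still unpaid.",
    "From: bob@example.com\nDate: 2022-06-03\nSubject: Great service\n\n"
    "Thanks for the quick delivery last week.",
]

# Pattern analysis: pick out "Key: value" header lines.
HEADER = re.compile(r"^(From|Date|Subject):\s*(.+)$", re.MULTILINE)


def structure(doc):
    """Turn one raw message into a flat record (one row of a table)."""
    headers, _, body = doc.partition("\n\n")
    record = {key.lower(): value.strip() for key, value in HEADER.findall(headers)}
    # Parsing: split the body into lower-case word tokens.
    tokens = re.findall(r"[a-z']+", body.lower())
    record["word_count"] = len(tokens)
    # Keep the most frequent longer words as a crude topic hint.
    counts = Counter(t for t in tokens if len(t) > 3)
    record["top_terms"] = [word for word, _ in counts.most_common(3)]
    return record


rows = [structure(doc) for doc in RAW_DOCS]
for row in rows:
    print(row)
```

Each resulting dictionary has the same keys, which is what makes the data "structured": it can be stored in rows and columns, queried and charted.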

Natural language processing is one key method used to structure raw text-based big data. This technology uses data in combination with algorithms to add context to the way machines try to understand human language. It essentially tries to replicate the process by which a human might read text, and often serves to resolve ambiguous words such as 'bow', which can mean a weapon, the front of a ship or a gesture of respect. It's also embedded in most AI-powered virtual assistants, like Apple's Siri or Amazon's Alexa.
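To make the 'bow' example concrete, here is a deliberately simple sketch of context-based disambiguation: each sense is given a small set of cue words, and the sense whose cues overlap most with the sentence wins. The sense labels and cue words are assumptions made for illustration, and production NLP systems rely on much richer statistical models.

```python
import re

# Hypothetical cue words for three senses of "bow".
SENSES = {
    "weapon": {"arrow", "archer", "shoot", "string", "quiver"},
    "ship": {"ship", "boat", "stern", "deck", "sail"},
    "gesture": {"audience", "stage", "applause", "curtsy", "respect"},
}


def disambiguate(sentence, senses=SENSES):
    """Return the sense sharing the most cue words with the sentence, or None."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(tokens & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    # If no cue word appears at all, decline to guess.
    return best if scores[best] > 0 else None


print(disambiguate("The archer drew his bow and reached for an arrow"))
print(disambiguate("Waves crashed over the bow of the ship"))
```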

NLP is deployed as part of this process to churn through reams of documentation in a way that would otherwise be too costly and time-consuming for any human, identifying the most relevant and important nuggets of information, based on any particular request.

One important branch of text mining is sentiment analysis, which involves combing through vast quantities of documentation to summarise how certain groups of people, either customers or employees, feel towards a certain issue. This could be used to learn how customers feel toward a brand, such as using text mining on web forums, or can be used to assess worker morale by subjecting internal emails to analysis.
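A minimal sketch of the idea is a lexicon-based scorer, which counts positive and negative words in each piece of text. The word lists below are invented for the example, and this approach ignores negation and sarcasm, so it should be read as an illustration rather than a description of how commercial tools work.

```python
import re

# Hypothetical sentiment lexicons; real ones contain thousands of weighted terms.
POSITIVE = {"great", "love", "helpful", "fast", "excellent"}
NEGATIVE = {"slow", "broken", "terrible", "rude", "refund"}


def sentiment_score(text):
    """Score text from -1.0 (all negative cues) to 1.0 (all positive cues)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    # Text with no sentiment-bearing words is treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


reviews = [
    "Great product, fast delivery",
    "Terrible support and a broken app",
    "It arrived on Tuesday",
]
for review in reviews:
    print(f"{sentiment_score(review):+.1f}  {review}")
```

Averaging these scores across thousands of forum posts or emails gives a rough picture of how a group feels about a topic.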

Relationships, patterns and key facts are isolated and then turned into structured data so that AI can conduct further analysis on the data and identify insights based on what was demanded in the first place.

Benefits of sentiment analysis

Once sorted into a more structured format, the data can then be exposed to algorithms designed to give businesses high-quality insights that would be impractical to glean through human-led analysis alone.

Sentiment analysis is one key application of text mining that can reveal how people think and feel about a company, or a particular aspect of a company. The insights could range from customer attitudes towards a brand to the morale of employees within the organisation.

In the former example, the text absorbed into the text mining process might come from online reviews, social media, customer interactions via email, as well as call centre interactions. These can be turned into data points to identify patterns that point to common threads in the way people perceive a certain brand. The information can then be presented in such a way as to devise strategies to solve negative branding and improve standards and practices.

This form of data analytics can also be applied within an organisation to monitor the way that workers interact with each other through workspace applications like Slack or Microsoft Teams, as well as email. This is so that an organisation can determine how employees are feeling towards the leadership, for instance, and use this information to find ways to boost morale or build trust in areas where it may be lacking.

The Enron effect

A now infamous scandal of the early noughties provides a useful case study for demonstrating the power of text mining. Almost a decade after the bankruptcy of energy firm Enron, a text-mining firm, KeenCorp, managed to sift through troves of emails dating back to the days of the scandal.

The emails in question held correspondence between 150 of the company’s executives, essentially chronicling the downfall of the company. KeenCorp was able to make sense of the vast trove by passing it through an algorithm tailored to track company morale.

By tracking changes in the tone of the messages, KeenCorp’s algorithm was able to pinpoint the exact date when communications started to turn sour: 28 June 1999. This also turned out to be the date that the company’s board had discussed ‘LJM’, a proposal to hide the company’s struggling finances. This is considered to be one of the key moments of Enron’s downfall.
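The article does not describe how the algorithm works internally, so the following is only a sketch of the general idea of spotting a turning point in tone. It assumes each day's messages have already been reduced to an average tone score, and it reports the first day of a sustained negative run. The dates, scores and the `first_sustained_drop` helper are invented for illustration.

```python
from datetime import date


def first_sustained_drop(daily_scores, window=3, threshold=0.0):
    """Return the first date that starts `window` consecutive days below threshold."""
    days = sorted(daily_scores)
    for i in range(len(days) - window + 1):
        span = days[i:i + window]
        # Require every day in the window to be negative, to ignore one-off dips.
        if all(daily_scores[day] < threshold for day in span):
            return span[0]
    return None


# Hypothetical daily average tone scores (positive = upbeat, negative = sour).
scores = {
    date(2022, 6, 1): 0.4,
    date(2022, 6, 2): 0.3,
    date(2022, 6, 3): 0.2,
    date(2022, 6, 4): -0.1,
    date(2022, 6, 5): -0.4,
    date(2022, 6, 6): -0.5,
}

print(first_sustained_drop(scores))
```

Requiring several consecutive negative days is a simple way to separate a lasting shift in mood from a single bad day.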

This is just one example of how text mining is able to make sense of enormous volumes of data that may otherwise obscure important information.
