Content Analysis and Text Mining

Highly advanced content analysis and text-mining software with unmatched handling and analysis capabilities

“For those who have ever needed to find themes or relationships in verbatim responses, focus group transcripts, or other text sources, WordStat is very attractive indeed.” Marketing Research, Spring 2006

WordStat is flexible and easy-to-use text analysis software, whether you need text mining tools for fast extraction of themes and trends, or careful and precise measurement with state-of-the-art quantitative content analysis tools. WordStat's seamless integration with Simstat (our statistical data analysis tool) and QDA Miner (our qualitative data analysis software) gives you unprecedented flexibility for analyzing text and relating its content to structured information, including numerical and categorical data.

What is it used for?

WordStat can be used by anyone who needs to quickly extract and analyze information from large numbers of documents. Our content analysis and text mining software is used for:

• Content analysis of open-ended responses, interview or focus group transcripts
• Business intelligence and competitive website analysis
• Information extraction and knowledge discovery from incident reports, customer complaints
• Content analysis of news coverage or scientific literature
• Automatic tagging and classification of documents
• Fraud detection, authorship attribution, patent analysis
• Taxonomy development and validation

Key and Unique features

Content Analysis Tools - Powerful content analysis and text mining software for handling large amounts of unstructured information. WordStat can process up to 20 million words per minute and identify all references to user-defined concepts using categorization dictionaries.
Text Mining and Visualization Tools - Integrated exploratory text mining and visualization tools such as clustering, multidimensional scaling, proximity plots, and more, to quickly extract themes and automatically identify patterns.
Unstructured Text with Structured Data - Relates unstructured text to structured data such as dates, numbers or categorical data to identify temporal trends or differences between subgroups, or to assess relationships with ratings or other kinds of categorical or numerical data.
Hierarchical Content Analysis Dictionaries - Use existing hierarchical content analysis dictionaries or taxonomies, or create your own, composed of words, word patterns, phrases and proximity rules (such as NEAR, AFTER, BEFORE) to achieve precise measurement of concepts.
Computer Assistance for Dictionary Building - Truly unique computer assistance for dictionary building, with tools for extracting common phrases and technical terms and for quickly identifying misspellings, synonyms, antonyms and related words in your text collection.
Keyword-in-Context and Keyword Retrieval Tools - One-click access to keyword-in-context and keyword retrieval tools for easy identification and coding of relevant text segments, validation of content analysis dictionaries, word-sense disambiguation or drilling down to the source documents.
Qualitative Coding Tool - Seamless integration with a state-of-the-art qualitative coding tool (QDA Miner) allows more precise exploration of data or more in-depth analysis of specific documents or extracted text segments when needed.
Machine Learning for Automatic Document Classification - Automatic document classification using Naive Bayes and K-Nearest Neighbours algorithms, with automatic feature selection and validation tools. Classification models can then be saved to disk and reapplied to new data.
Import and Export - Easy import of databases, spreadsheets and documents (including PDF and HTML), as well as export of text analysis results to common industry file formats (Excel, SPSS, ASCII, HTML, XML, MS Word) and graphs (PNG, BMP and JPEG).
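
To give a rough idea of how dictionary-based categorization with word patterns and a NEAR-style proximity rule can work, here is a minimal Python sketch. It is not WordStat's implementation or its dictionary syntax: the category names, the trailing `*` wildcard convention, the function names and the window size are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical dictionary: category -> list of word patterns.
# A "*" is treated as a wildcard here; this convention is an assumption
# for the example, not WordStat's actual dictionary format.
DICTIONARY = {
    "SERVICE": ["staff", "waiter*", "service"],
    "PRICE": ["price*", "expensive", "cheap*"],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def pattern_to_regex(pattern):
    """Turn a pattern such as 'price*' into an anchored regular expression."""
    return re.compile("^" + re.escape(pattern).replace(r"\*", ".*") + "$")


def categorize(text, dictionary=DICTIONARY):
    """Count how many tokens of the text fall into each category."""
    tokens = tokenize(text)
    counts = Counter()
    for category, patterns in dictionary.items():
        regexes = [pattern_to_regex(p) for p in patterns]
        counts[category] = sum(
            1 for token in tokens if any(r.match(token) for r in regexes)
        )
    return counts


def near(text, first, second, window=3):
    """Return True if `first` occurs within `window` tokens of `second`."""
    tokens = tokenize(text)
    pos_first = [i for i, t in enumerate(tokens) if t == first]
    pos_second = [i for i, t in enumerate(tokens) if t == second]
    return any(abs(i - j) <= window for i in pos_first for j in pos_second)


if __name__ == "__main__":
    comment = "The waiters were friendly but the prices are expensive"
    print(categorize(comment))
    print(near("the service was not good", "not", "good", window=2))
```

A proximity rule like this lets a dictionary tell "not good" apart from "good" when the two words sit close together, which plain word counts cannot do.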



WordStat is a content analysis and text mining add-on module of QDA Miner. This powerful software can:

  • Quickly analyze large amounts of unstructured data such as customer feedback, emails, open-ended responses, interview transcripts, incident reports, patents, legal documents, blogs or websites.
  • Build a content analysis dictionary to categorize text data automatically and quickly retrieve text segments related to a specific category (for example, positive and negative comments). You can apply statistical analysis to categories or explore the relationship between categories and other variables associated with the documents (e.g. author, location, time) to identify trends. To save time, you can reapply the same dictionary to a similar project in a few seconds, or customize or use an existing dictionary.
  • Provide statistical and visualization tools that are easy to interpret, such as word frequencies, clustering, correspondence analysis and heatmaps. WordStat can also compute statistical tests to verify the strength of the analysis. These features let you quickly identify themes, trends and patterns without having to read the documents, and explore the relationship between the content of documents and other categorical or numerical variables such as gender, age or education level.
  • Transform text into statistical tables and graphics. At any time, you can drill down to the source documents to see what is behind the numbers.
  • Give you total control over the content analysis process and enough closeness to the data to achieve the perfect balance between text analysis efficiency and precision in the results.
  • Help you create outstanding presentations and write professional reports that include statistical tables and graphics provided by WordStat, such as bar charts, pie charts, bubble charts, dendrograms, concept maps, correspondence analysis plots and more.
  • Analyze text data in almost any language, because the software relies on language-independent techniques.
  • Automatically correct spelling mistakes in your documents, so you do not need to find and fix them manually and can save a lot of time. To standardize the writing of similar words and phrases, and therefore obtain more accurate statistical results, WordStat lets you substitute any word or phrase with another of your own choice.
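
The substitution and cross-tabulation ideas in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how WordStat works internally: the substitution table, the function names and the sample records are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical substitution table: variant spelling -> standard form.
SUBSTITUTIONS = {
    "e-mail": "email",
    "web site": "website",
    "colour": "color",
}


def standardize(text, substitutions=SUBSTITUTIONS):
    """Lowercase the text and replace each variant with its standard form."""
    out = text.lower()
    for variant, standard in substitutions.items():
        # Lookarounds keep the match on whole words or phrases only.
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        out = re.sub(pattern, standard, out)
    return out


def keyword_by_group(records, keyword):
    """Count occurrences of `keyword` per group.

    `records` is an iterable of (group, text) pairs, where the group is a
    categorical variable attached to each document (gender, region, ...).
    """
    table = defaultdict(int)
    for group, text in records:
        tokens = re.findall(r"[a-z]+", standardize(text))
        table[group] += tokens.count(keyword)
    return dict(table)


if __name__ == "__main__":
    responses = [
        ("F", "Send an e-mail"),
        ("M", "Email me, email them"),
        ("F", "no contact"),
    ]
    print(keyword_by_group(responses, "email"))
```

Standardizing variants before counting means "e-mail" and "email" are tallied as one term, so the per-group frequencies are not split across spellings.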

Reviews of WordStat

RESEARCH, September 2008
THE POLITICAL METHODOLOGIST, vol 15 (1), Summer 2007

OR/MS Today, October 2005
RESEARCH, August 2005
LINGUIST, April 2004
FIELD METHODS, vol 11(2), 1999

Which base module?

WordStat is a module that must be run from either of the following base products:

QDA Miner - This text management and qualitative analysis program lets you create and edit data files, import documents, and perform manual coding of those documents. Several analysis tools are also available to look at the frequency of manually assigned codes and the relationship between those codes and other categorical or numeric variables. One feature that has become popular is "Query by Example", which retrieves text similar to a starting example and can "learn" from the user's relevance feedback. When used with QDA Miner, WordStat can perform content analysis on whole documents or on selected segments of those documents tagged with specific user-defined codes.

Simstat - This statistical software provides a wide range of statistical procedures for the analysis of quantitative data. It offers advanced data file management tools such as the ability to merge data files, aggregate cases, compute new variables and transform existing ones. When used with Simstat, WordStat can analyze textual information stored in any alphanumeric, plain text or rich text memo variable (or field). It includes various tools to explore the relationship between any numeric variable of a data file and the content of alphanumeric ones.

Which should I use? - If you are working primarily with textual data, QDA Miner provides the most powerful text manipulation and organization tools. QDA Miner is document oriented and has some unique tools to handle documents, perform searches on those documents, and tag them.

If you also need to analyze numerical data associated with your textual data, Simstat provides a wide range of statistical analyses. It offers advanced statistical routines such as multiple regression, multi-way ANOVA/ANCOVA, factor analysis and reliability analysis.

If you need to do both, you can get both Simstat and QDA Miner along with WordStat in the specially priced Prosuite package. They easily coexist and work with WordStat, giving you the widest range of options.


You can download a demo version of the latest WordStat from here.

Prices and ordering

For prices, on-line ordering and other purchasing information, please go to our ordering page.

System Requirements

  • Operating System: Microsoft Windows XP, 2000, Vista, Windows 7, 8 and 10
  • Memory: From 256 MB (XP) to 1 GB (Vista, Windows 7, 8 and 10)
  • Disk Space: 40 MB
  • Either QDA Miner or Simstat for Windows version 1.21d or later

About KCS

Kovach Computing Services (KCS) was founded in 1993 by Dr. Warren Kovach. The company specializes in the development and marketing of inexpensive and easy-to-use statistical software for scientists, as well as in data analysis consulting.

Get in Touch

  • Email:
  • Address:
    85 Nant y Felin
    Pentraeth, Isle of Anglesey
    LL75 8UY
    United Kingdom
  • Phone:
    (UK): 01248-450414
    (Intl.): +44-1248-450414