Photo by Tirza van Dijk on Unsplash

Helping data analysts create effective dashboards is a key task for User Experience (UX) designers. Incorporating multiple components into a unified report involves more than just a general knowledge of the user interface (UI) and UX. Even for seasoned UX designers, a great amount of effort is spent on distilling large amounts of complex information into a simple, clear storytelling report. As PolyAnalyst software UX designers, we work hard to incorporate web reporting features that enable users to create meaningful and interactive dashboards.

Recently, we received feedback from our users asking us for some tips on how to beautify their…

Many types of models are created from individual samples of data. We may have conducted a study, collected vital signs from different patients, and now want to fit a predictive model. These patients are unique and independent cases, which collectively make up a trend. However, many kinds of systems are not collections of individual data points; they are a sequence of observations of one concept over time. Fluctuations in the stock market, global temperatures, oil sales, solar flare activity, and so on are not baskets of different observations; they are sequences of the same object through time…

A simple model with complex applications

A cartoon image demonstrating linear regression.

Before we get into linear regression, you may recall that we recently discussed advanced machine learning techniques such as neural networks and support vector machines, but these are not always the most appropriate tools for modeling data. Such models are big, complicated, and almost impossible to interpret. While they have great capacity and are sometimes the only solution to difficult problems, their downsides can be substantial depending on our goals. Additionally, we often want to understand our models or have some measure of how valid they are.

For example…

Behind the scenes in NLP

A word cloud generated by PolyAnalyst.

If you ask five different people, “What does a Text Analysis tool do?”, it is very likely you will get five different responses. The term Text Analysis is used to cover a broad range of tasks that include identifying important information in text: from a low, structural level to more complicated, high-level concepts. Included in this very broad category are also tools that convert audio to text and perform Optical Character Recognition (OCR); however, the focus of these tools is on the input, rather than the core tasks of text analysis.

Text Analysis tools not only perform different tasks, but…

Seeing the random forest through the decision trees... Or should we be using neural networks?

Machine learning approaches to modeling are just that — approaches. There are many forms of models we could use with machine learning, each with different design philosophies and quirks. When setting out to use machine learning to create your own models, a question you may be asking yourself is: Which model framework do I choose? From Neural Networks, to Support Vector Machines, to Decision Trees, to Linear Regression, there are many options. While there is rarely a definitive or clear answer to this question, let’s discuss some things to consider when making this choice.

Model Properties

Model frameworks have a set of…

The power of big data

There are many reasons for the explosion of machine learning advancements over the past decade. We now have vastly improved hardware for fast computation, and memory is cheaper than ever. Data is now “Big Data,” and it is both jealously hoarded and publicly available in repositories such as ImageNet. Individually, these advancements are already a blessing for the technology space. But for artificial intelligence (AI), they have opened the gates for something truly powerful: Neural Networks.

A visualization of an artificial neuron

Neural Whatnow?

Neural Networks. You’ve probably heard of them. They are at the forefront of the machine learning craze and are the driver of many of the most…

Which one to choose?

Image by Megaputer

As our world becomes increasingly global, so does our data. Being an analyst today often means working with text data that contains multiple languages. So what do you do?

Essentially, there are two options we may consider: machine translation or native language analysis.

  1. With machine translation, we actually create a new dataset where the text has all been translated into a single language before we do the analysis. This makes the subsequent analysis much easier, as we only need to use a single language grammar module for the analysis.
  2. Native language analysis means that we keep…

How data analysis can promote more intelligent business decisions.

In 2018, US customers spent over $500 billion on online shopping. Often, customers write reviews to describe their experience and sentiments about the products they purchased. Since the reviews are publicly available to the world, they naturally play an important role in influencing potential customers’ purchase decisions, steering them towards or away from buying certain products. Online product reviews represent a unique source of unbiased information: customers are sharing their opinions with other potential customers, bypassing the manufacturer.

Inquisitive product manufacturers can gain invaluable insights from the analysis of online customer…

A brief introduction to working with text data

Data cleansing is critical in data analysis. The quality of data cleansing has a direct impact on the accuracy of the derived models and conclusions. In practice, data cleansing typically accounts for between 50% and 80% of the analysis process.

Traditional data cleansing methods are mainly used to process structured data, including the completion of missing data, modification of format and content errors, and removal of unwanted data. Resources on these methods are widely available. For example, big data engineer Kin Lim Lee published an article on this topic. Lee’s article introduced…
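As a rough illustration of the three structured-data steps named above (completing missing data, fixing format errors, and removing unwanted data), here is a minimal sketch in plain Python. The records, field names, and the choice of a median fill are assumptions made for this example; they are not taken from Lee's article or from PolyAnalyst.

```python
from statistics import median

# Hypothetical records; field names and values are made up for illustration.
records = [
    {"id": 1, "age": "34", "country": " us "},
    {"id": 2, "age": None, "country": "US"},
    {"id": 3, "age": "29", "country": "De"},
    {"id": 3, "age": "29", "country": "De"},  # exact duplicate row
]

def clean(rows):
    # Remove unwanted data: drop exact duplicate rows, keeping the first copy.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    # Fix format errors: trim whitespace, normalize case, cast age to int.
    for row in unique:
        row["country"] = row["country"].strip().upper()
        row["age"] = int(row["age"]) if row["age"] is not None else None
    # Complete missing data: fill missing ages with the median of known ages
    # (one simple strategy among many; assumed here for illustration).
    known = [row["age"] for row in unique if row["age"] is not None]
    fill = median(known)
    for row in unique:
        if row["age"] is None:
            row["age"] = fill
    return unique

cleaned = clean(records)
```

Duplicates are removed first so that a repeated row does not skew the median used to fill the gaps; a real pipeline would pick the order and fill strategy to suit the data.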

A robot and a go board.

Machine Learning is hot right now. Really hot. And it’s come a long way from where it was over two decades ago. In 1996, IBM’s Deep Blue defeated world chess champion Garry Kasparov. This was a great achievement to mark the progress of the field, but chess is a relatively simple task, and computers still struggled to master more difficult tasks for another decade. Then, in the late aughts, machine learning started to boom like never before. In 2011, IBM’s Watson combined live natural language processing with information retrieval to defeat two Jeopardy! champions. In 2016…

Megaputer Intelligence

A data and text analysis firm that specializes in natural language processing