Interactive Data Analysis and Visualizations

Data Visualization Tools for the Web

One of the major advantages of the digital humanities is the ability to work with and analyze large amounts of data. Naturally, one wants to visualize that data and the results of the analysis. There are many tools that aid in data visualization, and some are more intuitive to use than others. While it is wonderful that there are so many tools, it can be difficult to know which ones are best for the job. Their usefulness depends on the type of data being worked with, the overall scope of the project, and how much technical expertise there is on the team.

Chart Making

If you already have the data you want to show and know what kind of visualization you are after, there are several simple chart options available, especially for web display.

Google Charts

Google Charts is a common tool for making interactive charts on webpages using HTML, CSS, and JavaScript. This may sound a little daunting, but it was designed so that non-coders can still pick it up relatively easily. There is a decent range of chart types, and all of them are interactive, which makes them well suited to a digital report or webpage. One can either define the data table directly in the code or link to an online spreadsheet (such as Google Sheets). This is a little limiting in that these are the main ways to bring in data, and a more complicated display takes a lot more fiddling with code than might be comfortable. However, it makes aesthetically pleasing interactive charts for simple needs.
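
As a rough illustration of the "data table in the code" route, the sketch below builds the array-of-rows layout that google.visualization.arrayToDataTable accepts (header row first) and draws a column chart. The data, the toChartRows helper, and the chart_div element id are hypothetical, and the page is assumed to have already loaded Google's loader script and to contain an element with that id.

```javascript
// Hypothetical data: number of letters per decade in an imagined archive.
const lettersPerDecade = [
  { decade: "1840s", letters: 12 },
  { decade: "1850s", letters: 31 },
  { decade: "1860s", letters: 27 },
];

// Convert records into the array-of-arrays layout Google Charts expects:
// a header row followed by one row per record. (Hypothetical helper.)
function toChartRows(records) {
  const rows = [["Decade", "Letters"]];
  for (const r of records) {
    rows.push([r.decade, r.letters]);
  }
  return rows;
}

// Draw the chart into the element with id "chart_div" (an assumed id).
function drawChart() {
  const data = google.visualization.arrayToDataTable(toChartRows(lettersPerDecade));
  const options = { title: "Letters per decade", legend: { position: "none" } };
  const chart = new google.visualization.ColumnChart(document.getElementById("chart_div"));
  chart.draw(data, options);
}

// Only attempt to draw in a browser where the Google Charts loader is present.
if (typeof google !== "undefined" && google.charts) {
  google.charts.load("current", { packages: ["corechart"] });
  google.charts.setOnLoadCallback(drawChart);
}
```

Keeping the data preparation in its own function means the same rows could later come from a spreadsheet instead, without touching the drawing code.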

Variance

Variance Charts also plots simple charts via HTML and CSS, but attempts to smooth out the learning curve even further by dropping the need for JavaScript knowledge altogether. It is also a step up from Google Charts in that you can import CSV files instead of having to link to a Google Sheet. However, it does not inherently make the charts interactive. It is a good fit for quick charts with little coding knowledge, and still allows for some styling. As its own home page states, Variance is not the best choice for more complicated visualizations or data sets greater than ~20k rows.

JS Libraries

If there are project members who are knowledgeable about JavaScript and the HTML DOM/Canvas, then there are many JavaScript libraries that make custom interactive charts easier and more appealing, such as D3.js, Plotly.js, Highcharts, and Chart.js. D3.js in particular can be used for complicated data analysis and mining as well. (Many of the higher-level chart makers are built on D3.js.) These libraries are a good option when there is a programmer on the team, when showing the data is essential, and when the visual needs a level of customization that can’t be reached with simpler tools. There are a huge number of these libraries, so the best way to choose is to take a quick look at each and pick the one that best fits the needs of the project.
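
To give a sense of what working with one of these libraries involves, here is a minimal sketch using Chart.js. It counts how often each value appears in a list and turns the counts into the configuration object passed to new Chart(...). The genre list, the countBy and buildBarConfig helpers, and the genreChart canvas id are made up for illustration; the scale options assume Chart.js version 3 or later, and the page is assumed to have loaded Chart.js already.

```javascript
// Hypothetical sample: genre labels for a small corpus of texts.
const genres = ["novel", "poem", "novel", "essay", "poem", "novel"];

// Tally how many times each value occurs, keeping first-seen order.
// (Hypothetical helper, not part of Chart.js.)
function countBy(values) {
  const counts = new Map();
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  return counts;
}

// Build a Chart.js bar chart configuration from the tallies.
// (Hypothetical helper; the returned object follows Chart.js's config shape.)
function buildBarConfig(counts, label) {
  return {
    type: "bar",
    data: {
      labels: [...counts.keys()],
      datasets: [{ label: label, data: [...counts.values()] }],
    },
    options: {
      scales: { y: { beginAtZero: true } },
    },
  };
}

const genreConfig = buildBarConfig(countBy(genres), "Texts per genre");

// Render only in a browser where Chart.js has been loaded;
// "genreChart" is an assumed <canvas> id.
if (typeof Chart !== "undefined" && typeof document !== "undefined") {
  new Chart(document.getElementById("genreChart"), genreConfig);
}
```

The other libraries named above follow a broadly similar pattern of preparing data and then handing it to a drawing call, though each has its own configuration format.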

Data Visualization

For more in-depth analysis or complicated data manipulation, charting tools alone are not going to be the easiest to use. If the project is still in the analysis stage, there are a few good pieces of data analysis software.

Plot.ly

Plot.ly is an online graph maker that lets you import data from multiple sources and requires no coding knowledge to use. It does, however, require an account, and its interface is not the most intuitive. But for those who want something simpler than Tableau or Orange and do not need their more complex analysis capabilities (or are unable to download them), Plot.ly is a great option. If there is a programmer on the project, Plot.ly also has a JavaScript library for making more customized visualizations. There are free and paid versions.
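
For teams that do take the JavaScript route, a Plotly.js chart is described by an array of "trace" objects plus a layout object. The sketch below is illustrative only: the yearly counts, the toLineTrace helper, and the plot element id are hypothetical, and it assumes the Plotly.js script is already loaded on the page.

```javascript
// Hypothetical data: publications per year for one author.
const publications = [
  { year: 1901, count: 2 },
  { year: 1902, count: 5 },
  { year: 1903, count: 3 },
];

// Turn records into a Plotly trace: parallel x and y arrays plus a chart type.
// (Hypothetical helper, not part of Plotly.js.)
function toLineTrace(records, name) {
  return {
    x: records.map((r) => r.year),
    y: records.map((r) => r.count),
    type: "scatter",
    mode: "lines+markers",
    name: name,
  };
}

const publicationTrace = toLineTrace(publications, "Publications");
const publicationLayout = {
  title: { text: "Publications per year" },
  xaxis: { title: { text: "Year" } },
  yaxis: { title: { text: "Count" } },
};

// Plot only in a browser where Plotly.js has been loaded;
// "plot" is an assumed element id.
if (typeof Plotly !== "undefined") {
  Plotly.newPlot("plot", [publicationTrace], publicationLayout);
}
```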

Tableau

Tableau is desktop data visualization software. It lets you either manually define data sets or import them from several different file types. It allows for many ways to display data, and was specifically created to help with data analysis and visualization. There is a free version (Tableau Public) and a paid version, and the free version, which students would likely use, is a little more limited. It requires no programming experience, but graphs made in Tableau Public are all shared publicly on the web, and visual control is limited. However, the interactive visuals can be embedded into other web pages or downloaded as still images.

Orange

At first glance, Orange works in much the same way as Tableau, but is free and open source. It also allows for much more involved data mining and analysis. However, its user interface is not as easy to pick up, and the more involved data analysis requires more expertise. Its visuals are also not quite as sophisticated, but it is a much more robust tool for an aspiring data scientist. It does not require programming to use, but there is the option to do some scripting if one desires.

Play Favorites

These are all decent programs, and in most cases there is not going to be a single “best tool” for any given task. Still, one should keep the team's skill set and the project's end goals in mind when picking tools. In the end, it comes down to what needs to be accomplished, and what one likes to use and is comfortable with.