Frequently Asked Questions (FAQs)
Origins of the Project
Who are you and why did you build this?
This is not a commercial product. We are researching whether it is possible to build tools to improve data literacy without making people into statisticians. We suspect that the people who have something important to say with data aren’t always going to be the folks who like to play with data for its own sake. Our hypothesis is that when better tools are built, more people might come to understand the value of their own data, and be in a better position to advocate for themselves and their community in a big data world.
We don’t really know what those tools should look like yet, so we are learning by doing – trying out new directions, seeing what works and what doesn’t.
Anthropologists – seriously?
Seriously. We spent a lot of time in the Quantified Self community, and learned a ton about what measurement can mean to people. We found QS to be a place where people take the time to reflect on things – where people can tap into that part of the brain we all have that thinks slowly, not quickly. When we take the time to reflect, measuring can be a very human thing, like writing a diary, or painting a picture.
For example, one researcher on our team was tracking the healthiness level of her food. Using an analysis similar to the “periodic pattern” tool, she discovered that her food healthiness takes a dive on Mondays and Wednesdays – exactly when her partner works late into the evening, and they go out to eat together. Seeing that pattern helped her identify other parts of her eating habits where some extra calories weren’t as enjoyable.
We made the stuff that’s boring about data analysis easy, so you can start asking real questions of your data – and getting answers – sooner.
What Data Sense Can (and Can’t) Do
What kind of data can I upload to Data Sense?
Data Sense will take numerical data, categorical data (mood was “happy” or “sad”, etc.), free text and photographs. In its earliest stages, Data Sense focused on numerical data, but we plan to grow it to better visualize non-numerical data. If your data is in CSV format, you can upload it. If you use some common services like Fitbit or LastFM, we can help you access your data through an API connection.
How much data can I upload?
Part of the research is to understand how different visualizations and calculations affect workloads and latency. We won’t set artificial limits, but dense datasets can experience some lag. As we learn more about how this works, we will increase how much data we can take.
What can I do with my data on Data Sense?
We’ve turned the process of exploring data into a visual interaction, so you can connect the dots faster. You can look at your data in many different ways – no more static visualizations. One click allows you to toggle between your activity levels averaged by day of week versus hour of the day. A dragging motion creates a time offset, so you can see if drinking coffee at noon affects your mood later in the day. A selection box allows you to grab all the highs in your data and plot them on a map in thirty seconds. We even pre-loaded some weather information, so you can see how it might be affecting you.
For further information and a tutorial, see our introduction video.
What are the limitations?
The analysis tools in Data Sense are based on the kinds of things that data scientists commonly do when they are preparing data. While Data Sense is really great for exploring patterns, it is not designed to prove anything scientifically. If you want to prove something that way, there are many, many other tools out there for you. We built this assuming that in everyday, practical situations, most of us are perfectly capable of reasoning through what is causing what in our lives – we’re just pretty bad at remembering how much coffee we had on Wednesday.
Uploading a File
I can't upload my file.
The thing is, we are a very small team and we can’t predict every kind of file out there. If you email it to us at email@example.com, we’ll have a look and try to fix things for you by hand. The pre-loaded sample file is a good example of a file format that is not going to give you trouble.
Why do you insist all files have a time? Isn’t a date good enough?
We need a place to plot each data point at a particular time, or our computers will explode. If you have a better estimate of when it occurred, entering it will help you compare that data with other data which has a more fine-grained time stamp.
What do the percentages mean after I’ve picked a date or a time format?
These are our estimates about how likely we think it is that you are using a particular format. While we can make pretty good guesses about what date and time format you are using, we can’t always get it right.
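As a rough illustration of where percentages like these could come from, here is a minimal Python sketch. It is not Data Sense's actual method: it simply assumes that each candidate format is scored by the share of sample rows it can parse. The candidate list and the function name `score_formats` are hypothetical.

```python
from datetime import datetime

# Hypothetical candidate formats; the list the real tool checks is not documented here.
CANDIDATE_FORMATS = ["%m/%d/%Y", "%d/%m/%Y", "%Y-%m-%d"]

def score_formats(samples, formats=CANDIDATE_FORMATS):
    """Return {format: percentage of sample strings that parse under it}."""
    scores = {}
    for fmt in formats:
        parsed = 0
        for text in samples:
            try:
                datetime.strptime(text, fmt)
                parsed += 1
            except ValueError:
                # This string does not fit the format; count it as a miss.
                pass
        scores[fmt] = 100.0 * parsed / len(samples) if samples else 0.0
    return scores

# "25/12/2014" cannot be month/day/year, so that format loses some confidence.
samples = ["03/04/2014", "12/05/2014", "25/12/2014", "07/08/2014"]
scores = score_formats(samples)
```

With a sample like this, several formats can still look plausible at once, which is why a guess cannot always be right and the final choice is left to you.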
Why do you need so much information about my data?
There are limits to what we can offer blind. Eventually we’d like to offer forms of analysis that rely on knowing a little about the nature of the data, provide warnings when you try to fuse data that don’t share the same units, and so on. If you would rather we knew nothing, just choose the “other” categories and most of the tools will work just fine.
You don’t have my particular data format, but you should.
Drop us a line at firstname.lastname@example.org. If there are a few of you, we’ll try to prioritize it.
Streams, Sources and Creating Experiments
What do you mean by data streams vs data sources?
A data source is the place where data comes from. It could be a file you upload, or it could be an API connection to a service like Mood Panda or Open Paths. Within each source there can be many data streams – individual datasets like sleep hours, mood score, or annotations.
Many privacy advocates have argued that companies should give people more granular control over what data goes where. They inspired us to enable user control over what data gets uploaded at the individual stream level, not the source level.
What do you mean by experiment?
An "experiment" is a way to visualize a small handful of data at a time, from any data source. You can mix and match data from anywhere, and load in as much data as you want. You can make as many experiments as you'd like, and come back to them later.
I know I had more data than this. What happened?
When we ask for data on your behalf from a service, some services place limits on how much data can be sent over in any 24-hour period. If you check back later, Data Sense will eventually fill in the complete picture.
I use my Fitbit (or similar) all the time. Will new data be automatically uploaded?
When data comes from a service, by default it is automatically updated every time you sign in. All your experiments will reflect the new data. If your data comes from a file, and you update that file regularly, you can append to the file you already have on Data Sense, so it’s almost like an update.
I don’t want my experiments to automatically update with new data. How can I turn this off?
Go into Sources, click on the individual source and uncheck “auto sync.”
What is a temporary stream?
A temporary stream is a set of data that you create from within an experiment. It only exists within that experiment. It will persist within that experiment until you put it in the trash can.
Using the Time Series View
What am I really selecting when I select a portion of my data?
In any part of Data Sense, you can select data directly on the graph by clicking and dragging to create a box. You can see an example of this in our introduction video. In the time series, the box will select only the data that is both inside the box, and the same color as the box.
I set up an experiment, but I want to put different data in it now.
While you are in the experiment, use the “Edit” button next to the experiment name.
I am taking a rolling average/making quartiles/etc. Do these tools apply to all the data on my screen?
No. They only apply to a single stream at a time. You can tell which one it is because the stream line will be heavier, the circle next to its name will be bigger, and the tool will usually have the same color as the data. If you are in the rows tool, look for the enlarged circle.
If I do a rolling average/time shift, will it affect my data elsewhere?
These tools only apply to the experiment you are in, but they do apply across the whole experiment. This means that the data you uploaded will remain the same in our database, but within an experiment, if you go over to periodic pattern or any of the other ways to look at your data, you will be looking at rolling averaged or time-shifted data.
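To make the idea concrete, here is a minimal Python sketch of a rolling average and a time shift that produce new values and leave the original data untouched. It is an illustration under our own assumptions, not the code Data Sense runs: the function names, the trailing window, and the sample coffee data are all hypothetical.

```python
from datetime import datetime, timedelta

def rolling_average(values, window):
    """Trailing mean: each output averages up to `window` most recent values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def time_shift(points, offset):
    """Return a new list of (timestamp, value) pairs moved by `offset`."""
    return [(t + offset, v) for t, v in points]

# Hypothetical stream: cups of coffee logged at noon.
coffee = [(datetime(2014, 3, 3, 12, 0), 2), (datetime(2014, 3, 4, 12, 0), 1)]

# Shift coffee three hours later so it lines up with afternoon mood readings.
shifted = time_shift(coffee, timedelta(hours=3))
smoothed = rolling_average([2, 4, 6, 8], window=2)
```

Both functions build new lists, so `coffee` itself is unchanged. That mirrors the behavior described above: the transformed data is what you see inside the experiment, while the uploaded data stays the same.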
Can I change the data itself?
Yes, through the annotation tool. Be warned, though: this will change the underlying data within Data Sense, and your changes will be reflected everywhere. The only way to undo a change is to re-upload the original file, or to disconnect from the service and reconnect.
I want to see whether my data fell in a target range.
Select the Rows tool. Grab one of the black dots and slide it up or down so that one of the colors reflects your target range. It might help to make the remaining colors white (the color options fly out if you click inside the more saturated box). You can create a new dataset from the data in that target range by clicking inside the relevant translucent area.
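If it helps to think of this in code, the new dataset is essentially a filter on the values. The Python sketch below is only an illustration; the function name `in_target_range` and the sleep data are hypothetical, and we assume the range includes both endpoints.

```python
def in_target_range(points, low, high):
    """Keep only the (timestamp, value) pairs whose value lies in [low, high]."""
    return [(t, v) for t, v in points if low <= v <= high]

# Hypothetical stream: hours of sleep per night.
sleep_hours = [
    ("2014-03-03", 6.5),
    ("2014-03-04", 7.5),
    ("2014-03-05", 9.5),
    ("2014-03-06", 8.0),
]

# Target range: between 7 and 9 hours.
on_target = in_target_range(sleep_hours, low=7.0, high=9.0)
share = 100.0 * len(on_target) / len(sleep_hours)
```

Here `on_target` plays the role of the new dataset, and `share` is the percentage of nights that fell inside the range.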
Using the Periodic Patterns View
What is the Periodic Pattern view actually showing?
By default, it shows the average of all data points in the indicated time period. To change the time periods, use the Time Bins menu. To make the bars represent something other than an average, use the “Menu” under the data stream name, and choose “Aggregation.”
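For the curious, the default calculation can be sketched in a few lines of Python. This is an illustration under our own assumptions rather than the tool's real code: the function name `periodic_pattern`, the day-of-week binning, and the sample points are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def day_of_week(t):
    """Time bin: the name of the weekday a timestamp falls on."""
    return WEEKDAYS[t.weekday()]

def periodic_pattern(points, bin_of=day_of_week, aggregate=mean):
    """Group (timestamp, value) pairs by time bin, then aggregate each bin."""
    bins = defaultdict(list)
    for t, v in points:
        bins[bin_of(t)].append(v)
    return {name: aggregate(values) for name, values in bins.items()}

# Hypothetical stream: two Mondays and one Tuesday.
points = [
    (datetime(2014, 3, 3, 19, 0), 4),   # Monday
    (datetime(2014, 3, 4, 19, 0), 8),   # Tuesday
    (datetime(2014, 3, 10, 19, 0), 6),  # Monday
]

by_day_mean = periodic_pattern(points)                 # default: average
by_day_max = periodic_pattern(points, aggregate=max)   # a different aggregation
```

Swapping `bin_of` corresponds to picking a different entry in the Time Bins menu, and swapping `aggregate` corresponds to changing the “Aggregation” setting.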
The preset time bins don’t reflect my schedule. Can I customize them to make my own “lunchtimes” or “weekends”?
Yes. Click on “New” in the Time Bins menu.
Using the Flower View
I’m totally lost. What am I supposed to use this thing for?
The flower enables you to look at many, many data streams at once to see if they are related. This video explains more.
What do the colors mean?
The colors indicate whether the data is high, medium, low, and so on.
To me, high calories should be green, not red. How do I change this?
You can change the colors to something better suited to the data by clicking on “Thresholds” in the menu underneath the data stream name. Click in the color box itself, and other color options appear. You can also change them from the Time Series view.
Do the tiles represent individual data points?
No. The tiles show an aggregation of whatever data falls in that time slot (usually an average). Bear in mind that there are two time criteria that determine each tile. A tile might represent all the Saturdays in November, and its neighbor all the Sundays in November. It is possible, of course, that your data is sparse enough that there is only one data point that would fall into a given time slot.
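A minimal Python sketch of how two time criteria could define a tile follows. It is an illustration only, not the flower's actual code: the function name `flower_tiles`, the choice of month and day of week as the two criteria, and the sample data are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def flower_tiles(points, time1, time2, aggregate=mean):
    """Group values by a pair of time criteria; each pair is one tile."""
    tiles = defaultdict(list)
    for t, v in points:
        tiles[(time1(t), time2(t))].append(v)
    return {key: aggregate(values) for key, values in tiles.items()}

# Hypothetical stream: two Saturdays and one Sunday in November 2014.
points = [
    (datetime(2014, 11, 1, 9, 0), 10),  # Saturday
    (datetime(2014, 11, 8, 9, 0), 20),  # Saturday
    (datetime(2014, 11, 2, 9, 0), 7),   # Sunday
]

tiles = flower_tiles(
    points,
    time1=lambda t: t.month,                # first criterion: month number
    time2=lambda t: WEEKDAYS[t.weekday()],  # second criterion: day of week
)
```

In this sketch the tile `(11, "Saturday")` averages two data points, while `(11, "Sunday")` holds a single point, which is the sparse case mentioned above.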
How can I create more gradation than high/medium/low?
From within the Flower, go to the menu under the data stream name and click on “Thresholds.” From there, if you place your cursor on a colored area, boxes will fly out to your left and right. Click on the box to the left and you will break it up into two. To merge two levels into one, drag the black dot down or up into the next one.
For more precision, go over to the Time Series view and use the “Rows” tool. Anything you do there will be reflected in the flower view.
Can I rearrange how data is shown in the flower?
Yes, in a zillion ways. You can slide up or down anything that has a checkbox next to it by clicking and dragging the text. You can reposition the whole thing by dragging any one of the icons to the Data/Time 1/Time 2 position.
In fact, we encourage you to do as much arranging and rearranging as possible. It’s a good way to stumble into patterns you weren’t necessarily looking for.
Using the Map View
Why are these map pins black? My data streams are red and blue.
Often data points overlap when plotted on a map. The black tells you that there are points from multiple data streams collected at a single place.
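The rule can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the map's real logic: we treat "a single place" as an exact coordinate match, and the stream names, colors, and the function name `pin_colors` are hypothetical.

```python
from collections import defaultdict

# Hypothetical stream colors, matching the example in the question.
STREAM_COLORS = {"mood": "red", "steps": "blue"}

def pin_colors(points, stream_colors=STREAM_COLORS):
    """Map each (lat, lon) to a pin color; black means several streams overlap."""
    streams_at = defaultdict(set)
    for lat, lon, stream in points:
        streams_at[(lat, lon)].add(stream)

    colors = {}
    for place, streams in streams_at.items():
        if len(streams) > 1:
            colors[place] = "black"
        else:
            colors[place] = stream_colors[next(iter(streams))]
    return colors

points = [
    (45.52, -122.68, "mood"),
    (45.52, -122.68, "steps"),  # same place as the mood point above
    (47.61, -122.33, "mood"),
]
pins = pin_colors(points)
```

The first location gets a black pin because two streams share it; the second keeps the red of its only stream.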
I get an error when I try to use the map.
The map only works when you have some location data.
- Your data is really yours. We claim no ownership over it.
- You have use rights over any visualization you make with Data Sense. Publish them, put them on your wall, we don’t care.
- This is a research project. We have no interest in doing anything with your data other than figuring out how tools like this should be built. No selling, no advertising, and so on. Though we really only care about the aggregate patterns of site usage, our team can technically see all the details. If you would rather we didn’t know something about you, don’t upload it, or disguise it. If you subtract 20 pounds from your weight before you upload it, no one but you will know the difference, and the tools will work just fine.
- We’re not a data roach motel. If you need your data back, there’s an export button. If for any reason you have trouble with the export, contact us. We’ll fetch it for you.
Frankly, we’re a small team and we’d rather not spend the cycles doing it. We tried to future proof it as much as possible, but you just can’t predict everything ahead of time. It’s hard to say, but we’ll do our best to avoid it.
The Features We Don’t Have (Yet)
Can I calculate correlations?
We’re working on it.
My life would be improved if you would build this other feature.
We love hearing from people. Email us at email@example.com and we’ll see what we can do. It could be on our roadmap anyway – expect improvements to arrive about once a month.
Something's not working. How do I get help?