Data is the new media

Data storytelling, data journalism, and even data fiction – since the advent of Big Data, we find data used more and more often as a tool of narrative. With pattern recognition, exploratory data analytics, and especially data visualization, the focus of data has shifted from the quantitative to the qualitative.

More and more applications support us in using data to tell a story. Dashboards like Tableau or DataLion plug into our data sources and translate the numbers into a visual format that is much easier to digest. Even highly multivariate data can deliver straightforward meaning when we use tools like Gephi or, say, the notorious Palantir. These tools also make social media analytics and text mining feasible techniques for researching society, advertising, and markets.

Jawbone Up not only tracks our sleep. The app also shares our data in a meaningful way with our friends - like we share our thoughts on Twitter.
Data-driven storytelling has conquered most non-fiction publishing. News publishers like the New York Times or The Guardian employ huge teams of infographic specialists to enrich their reports with meaningful data visualization. Some of their editors have put together awesome collections of beautiful examples, e.g.

Our most personal data, however, is generated on our mobile and wearable devices. On our smartphones, wristbands, or smartwatches, some twenty sensors continuously track our behavior and our actions. There is a plenitude of apps making use of mobile data: to support our training, to guide our routes, to find friends nearby, to share images, etc.

Many people already share their daily workout via apps like Strava or Runtastic. It is even quite common to let such apps automatically post your training results into your social media timeline, e.g. to Twitter or Facebook.

Apps like Jawbone Up or Strava not only track our workouts, they also provide an easy way to share the data they measure. We publish our training data there the same way we publish our stories on Twitter or Facebook. Our data becomes equivalent to the texts and images we post. The most highly integrated version of this data-as-story so far is Google Now.

Image on top: Google Now. Google Now follows the idea of displaying all kinds of information in the form of tiles, much as Twitter or Facebook display the posts of the people you follow in a timeline. Funnily enough, Google obviously has no clue where my "place of work" is.

Data is media not only in terms of content. Advertising, which has by and large been data-driven for decades, is facing a major transformation. Media planning and buying – the art of placing ads in the most efficient way, i.e. optimizing effect for a given budget – is changing dramatically. About 20% of all ads are now placed programmatically. Programmatic buying means that an algorithm decides which exact user should see the ad, instead of the spot being bought via an explicit insertion order, as it used to be. The decision whether a certain user matches the campaign’s objective is made by predictions based on that user’s observed behavior. Data thus drives the ads that are displayed to us.
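To make the idea of a per-user decision a little more tangible, here is a minimal sketch in Python. It assumes a simple logistic score over a few behavioral features; the feature names, weights, bias, and threshold are all invented for illustration, and real programmatic platforms will use their own models and bidding logic.

```python
import math


def predicted_response(features, weights, bias):
    """Logistic score in (0, 1) computed from observed-behavior features."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))


def should_show_ad(features, weights, bias=-1.0, threshold=0.5):
    """Decide for one user whether the ad should be placed (hypothetical rule)."""
    return predicted_response(features, weights, bias) >= threshold


# Invented weights and users, for illustration only
weights = {"visited_sports_sites": 1.2, "bought_running_shoes": 2.0}
runner = {"visited_sports_sites": 1, "bought_running_shoes": 1}
stranger = {"visited_sports_sites": 0, "bought_running_shoes": 0}

show_runner = should_show_ad(runner, weights)
show_stranger = should_show_ad(stranger, weights)
```

With these made-up numbers the ad would be shown to the runner and not to the stranger; the point is only that the placement follows from a prediction about the individual user rather than from a booked slot.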

With the idea of ‘The Quantified Self’, data starts to conquer even the concept of our identity. We are not only what we tell, how we appear, and how we act voluntarily; we are also defined by our innards, by our bodies’ functions, by the data that comes from our physical being. This notion changes the concept of ‘self’, overcoming the strict separation of mind and body, of conscious and unconscious. The physical aspects of our lives now get equal credit as a veritable part of who we are.

Data is becoming an integral part of our stories. It pervades all media. We should learn to see data as part of our lives, in the same way we are used to telling about things with words.

Further reading:

We are content!
Data stories: From facts to fiction.

Data stories: from facts to fiction

The image above is taken from “Marx Engels Werke” (MEW): Marxism is the most prominent example of what postmodernism calls a ‘Grand Narrative’. Marx and Engels took all kinds of data, drew their conclusions, and told the one story that made sense from what they found.

A poem is never anything but an alphabet in disorder. (Jean Cocteau)

Our time is perhaps the time of an epidemic of things. (Tristan Garcia)

I remember the elderly complaining about “information over saturation” or even “overload” when I was a child, in the early 1980s. Thirty years later, the changing of the guard has come to my generation. “But once a sponge is at capacity, new information can only replace old information.” We read things like that in random articles every day. But what is this information that people are so afraid of no longer getting once the deluge of data has taken over?

What is data? Data is the raw content of our experience – primarily the sensory readings that get conveyed into our minds, secondarily the things we measure when we try to make experiences. I don’t want to get too philosophical here, but there are quite a few thinkers who share my discomfort with connecting data directly with facts. The whole of postmodernism is about deconstructing false confidence in empirical truths. A century ago, Husserl already warned us that the sciences might thus give us mediated theories rather than direct evidence. Quantitative social science, be it empirical sociology or experimental psychology, is particularly prone to the positivist fallacy. While throwing a die might be correctly abstracted into a series of stochastically independent occurrences of one and the same experiment, this is almost never true for human behavior.

Let us stop taking data as facts. Let us take data as fiction instead. Let us, just for the moment, think of data as the line of a story by which we tell about our experience. There might very well be no such thing as information in the data – just the scaffolding for different narratives that reduce the randomness and complexity. Take for example how our eyes abstract the shadows in a room’s corners into straight lines that make the edges. In fact, there is no such thing as a line; if you get closer and closer to the edge, you see a rather round or uneven surface spanning between one wall and the adjoining wall. The edge impression is just our way of reducing our visual sensory input to a meaningful aggregate; a story.

Data as such is mostly incomprehensible. To comprehend, we have to find structure, construct causalities, reduce complexity. Data visualization fulfills the same task: infographics tell a plausible story from data and make it digestible for our minds.

The link between data and our comprehension of reality is built via metaphors. A metaphor connects different things in such a way that we can identify one with the other. If we summarize objects under one category, this category in fact becomes the metaphor. “‘Table’ is a word with five letters”, as Rudolf Carnap put it. The concept of a ‘table’, however, is the metaphor, the image, the ideal of an arbitrary set of objects. To speak of a ‘table’ is our way of evoking, in the consciousness of our audience, an image of the concrete object we have in mind.

There is no law that forces us to see data as necessary, as caused and as causing effects. If data were positive, scientific progress would just consist of correcting the errors of predecessors. But this is certainly not the case. Even what is called ‘hard science’ changes direction according to the narrative. Quantum physics was not necessary. Heisenberg’s operators are not reality in the sense that there is a factual object that changes one quantum state to the next. They are a meaningful abstraction from a reality that we cannot directly comprehend. In the same way, we may use data from social interaction, behavioral data, or economic data, and try to find a meaningful narrative to share our model of the world with others.

The narrative that we derive from data is of course by no means totally random; not every narrative fits our data points. But within our measurements, any model that does not contradict the data could be possible, and might – depending on the context – make an appropriate metaphor of our reality.

Since many narratives are possible, and a broad range of parameters can fit our data, we should be humble when it comes to value judgements. Whether a decision can be justified from our data depends on the model we choose. We should be clear that we have a choice, and that with this choice comes responsibility. We should be clear about our ethics, about the policies that guide how we set the models’ parameters. We should be aware of algorithm ethics.

We should also recognize that our data story is not free from hierarchy. It is very well possible that we impose something onto others with what is just one possible narrative; no story can be told independently from social context.

When we accept data not as just the facts that have to add up to some information, but as the hints of our story, we will be liberated from the pressure of sucking every bit into our brains. We might miss something, but that will hardly be more dramatic than in past times. Our model might not be perfect, but nevertheless, in hearing the data narrative we might catch a glimpse of what is missing. We should just let go of our dogma of data as facts.

As big data becomes the ruling paradigm of empirical sciences, I hope we will see lots of inspiring data stories. I hope that data will transcend from facts to fiction. And I want to hear and tell the fairytale where we wake the sleeping beauty in data.

This is the summary of my talk “Data story telling: from facts to fiction”, which I gave at the Content Strategy Forum 2014 in Frankfurt.

A Bit of Data Science – What Your Battery Status Tells About You

When working with lots of data, the biggest challenge is not storing or handling it – these jobs are far from trivial, but there are solutions for nearly any kind of problem in this space. The real work with data starts when you ask yourself: what’s behind the data? How could you interpret this data? What story can you tell with this data? That’s what we do, and we want to share some of our findings with you and motivate you to join our discussion about the meaning of the data. We want to create Data Fiction.

Today, we start with some sensor data collected by our explore app – the smartphone’s battery status including the loading process. Below you see sample data for our user’s behavior during the week (Feature Visual) and at the weekend (Figure 1).

Figure 1: Smartphone Battery Status (weekend) (Datarella)

In Figure 1 you see that most users load their smartphones around 7 a.m. and (again) around 5 p.m. What does that tell us? First, we know when most users wake up in the morning: around 7 a.m. Most probably they have used their smartphones’ alarm functions and then connect their devices to the power supply. In the late afternoon, they load their devices a second time – probably at their office desks – before they leave their workplaces. During weekends, the loading behavior is different: people get up later, and maybe use their devices for reading, social networking or gaming, before they reconnect them to their power supplies.

Late rising leads to an average minimum battery status of 60% during weekends, whereas during the week, users let their smartphones’ batteries go down to 50%. This difference of 10 percentage points is interesting, but the real surprise is how high the minimum battery status is in absolute terms: 50% and 60%, respectively. It seems that the days of “zero battery” and hazardous action to get your device “refilled” are completely over.
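For readers who want to try this on their own data, here is a minimal sketch of how an average daily minimum could be computed separately for weekdays and weekends. It assumes readings come as pairs of timestamp and battery percentage; that format, the function name, and the sample values are invented for illustration and are not taken from the explore data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def average_daily_minimum(readings):
    """Return the mean of the per-day minimum battery levels, split by day type.

    readings: iterable of (datetime, battery_percent) pairs (assumed format).
    """
    daily_min = {}
    for ts, level in readings:
        day = ts.date()
        daily_min[day] = min(level, daily_min.get(day, 100))

    by_type = defaultdict(list)
    for day, lowest in daily_min.items():
        # weekday() returns 5 for Saturday and 6 for Sunday
        kind = "weekend" if day.weekday() >= 5 else "week"
        by_type[kind].append(lowest)

    return {kind: mean(values) for kind, values in by_type.items()}


# Invented sample readings, for illustration only
sample = [
    (datetime(2014, 6, 2, 7, 0), 95),    # Monday
    (datetime(2014, 6, 2, 16, 30), 48),
    (datetime(2014, 6, 3, 12, 0), 52),   # Tuesday
    (datetime(2014, 6, 7, 10, 0), 90),   # Saturday
    (datetime(2014, 6, 7, 20, 0), 60),
]
result = average_daily_minimum(sample)
```

The sketch takes the lowest reading of each calendar day first and only then averages, so a single long day with many readings does not dominate the result.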

For some, data is art. And often, it’s possible to create data visualizations resembling modern art. What do you think of this piece?

Figure 2: Battery Loading Matrix (Datarella)

This matrix shows the daily smartphone loading behavior of explore users per time of day. Each color value represents a battery status (red = empty, green = full). So you can either print it and use it as data art on your office wall, or you can think about the different loading types: some people seem to “live on the edge”, while others do everything (i.e. load) to stay on the safe side of smartphone battery status.
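A matrix like this can be approximated with very little code. The following sketch maps each battery level to a color between red (empty) and green (full); the linear color blend and the layout of one row per day are assumptions, since the post does not say how the original figure was built.

```python
def battery_color(percent):
    """Map a battery level (0-100) to an RGB tuple: red = empty, green = full."""
    level = max(0, min(100, percent)) / 100  # clamp out-of-range readings
    red = round(255 * (1 - level))
    green = round(255 * level)
    return (red, green, 0)


def loading_matrix(levels_by_day):
    """Turn rows of battery levels into rows of RGB colors.

    levels_by_day: list of lists, one inner list of readings per day
    (a hypothetical layout; the original matrix may be organized differently).
    """
    return [[battery_color(level) for level in day] for day in levels_by_day]


# Two invented days with four readings each
matrix = loading_matrix([[100, 75, 50, 0], [20, 10, 90, 100]])
```

The resulting rows of color tuples can be handed to any plotting or imaging library to draw the actual picture.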

What are your thoughts on this? When and how often do you load your mobile device? Would you describe your loading behavior as “loading on the edge” or “safe”? We would love to read your thoughts! Come on – let’s create Data Fiction!

Call for Data Fiction


Do you read science fiction? Can you make data interesting? Can you tell the story behind a pool of data? Are you a data fictionista? Submit your data fiction.

People, animals, plants and things produce data – a lot of data. The data itself is the basic resource – like words are the basis for language. If you put words together into sentences, combine sentences into chapters, and aggregate several chapters, you write a story; you create fiction. The same goes for data: if you combine different data sources into data pools and aggregate them, you write the story behind the data; you create data fiction.

[Strong narrative] augments the available data by way of context, and extends the patience of the audience by sustaining their interest as well.

Does that sound like you?

We’d love to see and discuss your applications, analyses, case studies and models with you and help you make your data fiction become reality.

The Data
We will provide you with sample data resulting from the usage of our explore app.

The App
The data has been created by users of the explore app. In explore, the user interacts by answering surveys, completing tasks, and heeding valuable recommendations based on her behavior. She immediately sees the results of her interactions in the feedback area. In addition, explore tracks several sensors of the user’s phone, which the user can switch on and off herself (see the full list of sensors below). explore connects both areas, the interactions and the sensor tracking, through the integrated Complex Event Processing Engine (CEPE).


The Complex Event Processing Engine (CEPE)
The CEPE is a mechanism for efficiently processing continuous event streams in sensor networks. It enables rapid development of applications that process large volumes of incoming messages or events, regardless of whether they are historical or real-time in nature.
Our CEPE is based on Esper and its Event Processing Language (EPL).
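To give a feeling for what processing a continuous event stream means, here is a small plain-Python sketch. It is not Esper and not EPL, and it says nothing about how the CEPE itself is built; it only illustrates the general idea of evaluating a rule over a sliding window as events arrive. The class name, window size, and threshold are invented.

```python
from collections import deque


class SlidingWindowRule:
    """Fire when the average of the last `size` events drops below `threshold`."""

    def __init__(self, size, threshold):
        self.window = deque(maxlen=size)  # the oldest event falls out automatically
        self.threshold = threshold

    def on_event(self, value):
        """Process one incoming event; return True if the rule fires."""
        self.window.append(value)
        full = len(self.window) == self.window.maxlen
        return full and sum(self.window) / len(self.window) < self.threshold


# Invented battery readings arriving one by one
rule = SlidingWindowRule(size=3, threshold=30)
fired = [rule.on_event(v) for v in [80, 40, 25, 20, 15]]
```

Each event is handled once, on arrival, and only the last few events are kept in memory, which is what makes this style of processing suitable for large or unbounded streams.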

List of Sensors
– GPS location data
– Network location data
– Accelerometer
– Gyroscope
– Wi-Fi
– Magnetic field
– Battery status
– Mobile Network

– Overview and extended description or representation of your main idea, any subtopics and a conclusion
– Use or integration of at least 1 (one) category of sensor data (e.g. Gyroscope). If you use GPS location, you should use or integrate at least 1 (one) additional category of sensor data beside GPS location data.

– Presentation
– Video
– Installation

We will reward fascinating data fiction with preferred access to our data, a post on the QS Blog and the possibility of making data fiction come true.

Yes, I am a data fictionista and want to submit my data fiction!