Documentation for Developers

Introduction

It's easy to build real-time data-driven visualisations, or add analytics to your existing applications. We provide a hosted back-end with analytics capabilities, into which you can load your data via a RESTful API.

If you want to modify your visualisation's appearance beyond what our UI allows, you can clone the dataseed-visualisation.js repo on GitHub. It is written entirely in JavaScript and provided as open source under the GPL2 license.

Benefits

  • Powerful cloud-hosted OLAP analytics engine

  • Support for real-time data streams

  • RESTful JSON API for importing and querying data

  • Open-source JavaScript front-end, built with backbone.js, gulp, bootstrap - all the good stuff

  • SVG charts built with d3.js and dc.js

  • Versatile multi-dimensional data model

  • Statistical operations including mean, min, max, variance, sum of squares and standard deviation

  • Responsive and ready for desktop / tablet / mobile

  • Paid support plans, custom development, training, and consultancy are available - just ask

Getting Started

Glossary

  • Dataset - Like a table. Each column in the table is a Field and each row an Observation.
  • Field - A "column" in the Dataset.
  • Observation - A "row" in the Dataset.
  • Measure - A type of Field used in a chart as a numeric value, optionally aggregated (e.g. distance driven by a vehicle).
  • Dimension - A type of Field used in a chart as a non-numeric dimension (e.g. a vehicle's model or make).
  • Visualisation - A set of Elements using the Measures and Dimensions from the Visualisation's Dataset.
  • Element - A component of a Visualisation such as a chart or aggregate summary.
  • Chart - A graphical representation of a Measure and a Dimension (e.g. number of miles driven by vehicle model).

Embedding

The simplest way to embed a visualisation is to use the "Export" button on your visualisation's toolbar. You will be provided with a snippet of HTML that you can copy-and-paste into your site.

For help on more advanced embedding, providing full control of the visualisation's appearance, have a look at dataseed-visualisation.js.

Importing

There are two ways to add data through the API: by providing the URL of a spreadsheet to the import API, or by manually creating a dataset and adding observations.

Import API

First, start an import by POSTing a new import model. The URL should point to a publicly accessible CSV or XLS file that you want to import:

If the file was fetched successfully the import will start and Dataseed will respond with:

{
    "href": "/api/imports/95e2e1761e80fe5542cc95ff1c3e7ca5",
    "status": 303
}

The href in this response can be used to query the status of the import:

curl -XGET -u'dataseed@example.com:password' -H'Accept: application/json' https://dataseedapp.com/api/imports/95e2e1761e80fe5542cc95ff1c3e7ca5

When the import has finished the status attribute of the import will be set to completed:

You can view the imported data by selecting the visualisation on your datasets page or by querying the dataset and its observations.

Manual Dataset creation

Alternatively, you can import data directly through the API without the need to upload a spreadsheet. This approach is more complicated but allows greater control over the dataset's fields and data types.

Firstly, a new empty dataset and visualisation must be created. You'll need to create a field for each "column" of your data:

If the dataset was created successfully Dataseed will respond with:

{
    "href": "/api/datasets/my-dataset",
    "status": 303
}

Once you have an empty dataset you can start adding observations for each "row" of your data:

When the observations have been created successfully Dataseed will respond with:

{
    "message": "Created",
    "status": 200
}

To visualise and explore your data select the newly created dataset on your datasets page.

Drupal

For easy Drupal integration check out the Dataseed module on drupal.org

dataseed-visualisation.js

dataseed-visualisation.js allows visualisations to be integrated with existing JavaScript applications. Please note that a working knowledge of JavaScript is necessary to integrate it. For a simpler method of including visualisations on your site see Basic Embedding.

The code consists of models and views for fetching and rendering visualisations built on the Backbone.js framework with RequireJS used for module and dependency handling.

Dependencies

The Node.js platform is used for building the Javascript and related tasks. You can install Node.js using the packages on the download page or through your OS's package manager.

Setup & Build

The README on GitHub contains the latest instructions for setting up and building the project. The build process is managed by Gulp and includes JavaScript aggregation and minification, LESS compilation, and Jasmine tests. To run the build use:

gulp

Modifying code

Only modify JS or CSS code in the "src" directory. To view your changes in a browser immediately without having to build, you will need to run a web server. We have provided one using the gulp-connect plugin: run the command below and load http://localhost:8080/index-src.html in your browser.

gulp serve

Example

The following example demonstrates a simple web application that renders a visualisation when a user clicks a button:

This example shows how to define a visualisation directly in the context of the consumer application instead of fetching it from Dataseed. For a full list of all the available attributes and their meanings see the Visualisation API section.

Architecture

Dataseed uses a standard Backbone Model/Collection/View architecture, where each model has an associated view for rendering and groups of models are stored together in collections. The most important components of Dataseed are listed below along with brief explanations. For more information please see the full code on GitHub or get in touch.

  • Dataset model (src/js/models/dataset.js) - The dataset model owns the visualisation model and handles filtering of data.
  • Dataset view (src/js/views/dataset.js) - The dataset view owns the visualisation view and initiates rendering.
  • Visualisation model (src/js/models/visualisation.js) - The visualisation model owns the collection of elements (i.e. charts) and handles filtering events (addCut and removeCut) by passing them on to the dataset model.
  • Visualisation view (src/js/views/visualisation.js) - The visualisation view handles adding, removing, re-ordering and rendering of the visualisation elements. For each element model in the elements collection a corresponding element view is created.
  • Elements collection (src/js/collections/elements.js) - The elements collection owns all the element models in a visualisation. It uses the polymorphic models pattern to keep both DimensionalElements (charts with a dimension and measure) and MeasureElements (charts with only a measure).
  • Element model (src/js/models/visualisation/element.js) - The element model fetches visualisation data through the ConnectionPool collection. It is also responsible for building chart labels/tooltips, sending filter events and providing methods to help charts interrogate the data, for example by checking whether a particular dimension is currently being filtered.
  • Element view (src/js/views/element.js) - The element view is used to create the actual chart views and delegates all rendering to them.
  • Chart view (src/js/views/element/d3/chart.js) - The base chart view is inherited by the various chart types (e.g. the bubble chart). It provides basic rendering of the chart container (title, margins, etc.) as well as methods to help with filtering and styling individual charts.

Authentication

To allow embedding of private visualisations without revealing your username and password a separate HMAC-based authentication method is available. To use this authentication method you will need a way to sign messages for your users, i.e. the ability to run code server-side.

Before you start you'll need your user_token and user_key values, which can be found on your profile page. WARNING: Your user_key should never be revealed (e.g. by including it in your HTML or javascript). Anyone with access to your user_key can use it to see your private data!

The process for authenticating a user's request to an embedded visualisation is as follows:

  1. The user's browser makes a request to your site
  2. Your site calculates an auth message and HMAC and returns these to the user, along with the embedded visualisation
  3. When the dataseed javascript executes in the user's browser it will request data from dataseed using the credentials provided
  4. Dataseed will return the requested data to the user, as long as the following conditions are met:
    • The auth message is in the correct format
    • The requested dataset_id is owned by your dataseed account
    • The auth message hasn't expired
    • The HMAC is valid

The authentication message is a base64 encoded JSON object with the following keys:

  • user - Your user_token
  • dataset - Your dataset's ID
  • filters - An object containing mandatory filters on your data
  • timestamp - A unix timestamp (in UTC) indicating when the message was created

An example authentication message, before base64 encoding:

{
    "user": "xxx",
    "dataset": "xxx",
    "filters": {},
    "timestamp": 471484800
}

After you've calculated the auth message and HMAC (see the PHP and Python examples below) they should be included in the page after the dataseed javascript but before initialising the embedded visualisation:

require.config({
    'config': {
        'models/authSingleton': {
            'AUTH': {
                'msg': 'TGFzdCBDaHJpc3RtYXMgSSBnYXZlIHlvdSBteSBoZWFydCBCdXQgdGhlIHZlcnkgbmV4dCBkYXkgeW91IGdhdmUgaXQgYXdheS4gVGhpcyB5ZWFyIFRvIHNhdmUgbWUgZnJvbSB0ZWFycyBJJ2xsIGdpdmUgaXQgdG8gc29tZW9uZSBzcGVjaWFsCg==',
                'hmac': 'TWF5YmUgbmV4dCB5ZWFyIEknbGwgZ2l2ZSBpdCB0byBzb21lb25lIEknbGwgZ2l2ZSBpdCB0byBzb21lb25lIHNwZWNpYWwK'
            }
        }
    }
});

PHP

Python
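
As a rough Python sketch of step 2 of the process above: the message is serialised to JSON, base64 encoded, and then signed with your user_key. The documentation does not state which hash function the HMAC uses, exactly which bytes are signed, or how the digest is encoded, so HMAC-SHA256 over the base64 message with a hex digest is an assumption here and should be confirmed with Dataseed. The function name build_auth is hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def build_auth(user_token, user_key, dataset_id, filters=None, timestamp=None):
    """Return (msg, signature) strings for an embedded visualisation.

    ASSUMPTION: HMAC-SHA256 over the base64 message, hex encoded.
    The real digest algorithm and encoding are not given in the docs.
    """
    message = {
        "user": user_token,
        "dataset": dataset_id,
        "filters": filters or {},
        # Unix timestamp (UTC) recording when the message was created.
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    msg = base64.b64encode(json.dumps(message).encode("utf-8"))
    signature = hmac.new(user_key.encode("utf-8"), msg, hashlib.sha256).hexdigest()
    return msg.decode("ascii"), signature
```

The two returned strings would be placed in the 'msg' and 'hmac' keys of the require.config block shown earlier. This code must run server-side so that the user_key never reaches the browser.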

Styles

To change text and background colours in the visualisation model an array of styles is used. Each style is represented in the array as an object with a style ID and value.

For example, to set the colour for the heading text of every chart in a visualisation the model should contain:

For the full list of styles see the Styles API. To style the visualisation in other ways (e.g. changing the font) the required elements can be targeted by your own CSS in the usual way.

Add New Chart

When using the embedded JS you can create new chart types as well as using the standard chart types. The process to create a new chart type is:

  1. Create a Backbone view that implements a render method. Whenever data is received (during the initial page load or when a cut is applied) the render method will be called to output the HTML/SVG for the chart.
  2. Add your new view class to the ElementView chart types object in ElementView.prototype.chartTypes
  3. Set the chart type in your element's model

In your chart view's render method you have access to its associated model which has a number of useful methods:

  • getObservations - Returns an array of all observations
  • hasCutId - Check if the visualisation is cut on the supplied dimension member ID
  • featureClick - Cut the visualisation on the supplied value

When creating new chart types it can be useful to extend the ChartView class as it contains a number of useful methods for dealing with cuts and styles.

The example below shows how to create a basic "list" type chart. Each observation is rendered as a link that can be clicked to cut on it.

API

Introduction

The API for Dataseed is implemented as a set of JSON resources (e.g. a Visualisation), each with their own URL. The resources may be read or altered using the standard HTTP verbs. We have tried to follow RESTful principles in the design of the API wherever possible.

The examples in this documentation use the cURL command line tool to interact with the API. If you have cURL installed (Linux and OS X users should have it by default) you can copy and paste the commands into a terminal to see them in action! For authenticated commands you'll need to replace the example account details with your own (see Authentication).

Requests

Requests to the API must always specify an Accept header with the JSON MIME type (i.e. "application/json"). So you could fetch a public dataset using:

curl -H'Accept: application/json' https://dataseedapp.com/api/datasets/mortality

When sending data to the API you must also specify the MIME type of your request using the Content-Type header. The following example shows how to add an observation to an existing dataset:

You can also get HTML responses to GET requests made in a browser, e.g. https://dataseedapp.com/api/datasets/mortality

Responses

There are two classes of response returned by the API, indicated by the HTTP status code:

200/300: Success

The request succeeded. When creating or updating resources a 303 code will be returned. The URL the resource should be fetched from is specified in the Location header.

{
    "status": 200,
    "message": "Created"
}
400: Bad Request

The request failed because it was invalid. A general error message will be provided in the returned JSON as well as more specific errors when applicable.

{
    "status": 400,
    "message": "Invalid",
    "errors": {
        "label": "Required",
        "public": "Required"
    }
}
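
A client might fold the two response classes above into a single check. The sketch below works on an already-decoded JSON body and relies only on the status, message and errors keys shown in the examples; the names DataseedError and check_response are hypothetical.

```python
class DataseedError(Exception):
    """Raised for a failed request; keeps the per-field errors, if any."""

    def __init__(self, message, errors):
        super().__init__(message)
        self.errors = errors


def check_response(body):
    """Return the body for 2xx/3xx statuses, raise DataseedError otherwise."""
    status = body["status"]
    if 200 <= status < 400:
        return body
    raise DataseedError(body.get("message", "Request failed"), body.get("errors", {}))
```

With the 400 example above, the raised exception would carry {"label": "Required", "public": "Required"} in its errors attribute.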

Authentication

All requests must be authenticated except when reading Datasets or Visualisations that are public. Currently, the only supported authentication method is HTTP Basic Authentication. The username and password should be the same ones you use to log in (if you haven't already got an account, sign up for one now).

So, for example, to fetch a private dataset you would do the following:

curl -u'dataseed@example.com:password' -H'Accept: application/json' https://dataseedapp.com/api/datasets/my-dataset
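
The same request can be prepared from Python using only the standard library. This is a minimal sketch: build_request is a hypothetical helper, and it only constructs the request (with the Accept header and, optionally, an HTTP Basic Authorization header) without sending it.

```python
import base64
import urllib.request

API_ROOT = "https://dataseedapp.com/api"


def build_request(path, email=None, password=None):
    """Build (but do not send) a GET request for an API path."""
    request = urllib.request.Request(API_ROOT + path)
    request.add_header("Accept", "application/json")
    if email is not None:
        # HTTP Basic Authentication: base64 of "username:password".
        credentials = f"{email}:{password}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request


request = build_request("/datasets/my-dataset", "dataseed@example.com", "password")
# To send it: urllib.request.urlopen(request)
```

Public datasets can be requested by leaving out the credentials, e.g. build_request("/datasets/mortality").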

Import

To import a dataset from a file (as opposed to directly inserting observations) the import API can be used. An import with this method consists of two steps:

  1. Setup, Validation & Type Detection
  2. Background Import Process

The first step is accomplished by POSTing an import model with a publicly accessible URL for a spreadsheet file and the label and public values for the new dataset (see the examples below).

Once the import has been successfully POSTed a background process will be started that transforms the rows in the spreadsheet into observations in Dataseed. Each observation will have an ID as well as a value for every column/field. For example:

{
    "id": 1,
    "d1": "2014-12-01T00:00:00",
    "d2": 3,
    "d3": 0.18
}

In the example above the ID is "1", meaning that this observation was created from the first row of data in the spreadsheet. There are also three fields:

  • d1 - A date/time value
  • d2 - A string value represented by a dimension ID. To find the actual string value this ID corresponds to you need to GET the dimension's values (so for this example you would GET /api/datasets/my-dataset-name/dimensions/d2).
  • d3 - A numeric value. This would typically be used as a measure.

It is often helpful to GET the dataset that an import has created, to understand the different fields and the types of data they are being used for.
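
To illustrate how a dimension ID such as the "d2" value above might be turned back into its string, the sketch below joins observations with a list of dimension values. It assumes the dimension endpoint yields objects with id and label keys (as described in the Dimension Object section); the exact response shape is not shown in this documentation, and resolve_dimension and the "Hatchback" label are hypothetical.

```python
def resolve_dimension(observations, field_id, dimension_values):
    """Replace dimension IDs in one field with their labels.

    IDs are compared as strings because observations may carry integers
    while dimension values use string IDs.
    """
    labels = {str(value["id"]): value["label"] for value in dimension_values}
    resolved = []
    for observation in observations:
        row = dict(observation)  # leave the caller's data untouched
        row[field_id] = labels.get(str(observation[field_id]), observation[field_id])
        resolved.append(row)
    return resolved


observations = [{"id": 1, "d1": "2014-12-01T00:00:00", "d2": 3, "d3": 0.18}]
dimension_values = [{"id": "3", "label": "Hatchback"}]  # hypothetical values
resolved = resolve_dimension(observations, "d2", dimension_values)
```

IDs with no matching dimension value are left unchanged.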

The Import Object

files (array) type (string) The type of import file, must be external when importing through the API.
key (string) A publicly accessible URL for a CSV, XLS or XLSX file.
data (object) label (string) The label for the dataset that will be created by the import.
public (boolean) The public/private status of the dataset created by the import.
columns (object) By default the data type of each column will be auto-detected during the import. To explicitly set the type of each column use a key of the column's index (0 based) as a string and a value of StringType, IntegerType, DecimalType, DateType or DateYearType.
status (string) The status of the import, one of pending, ready, running, failed or completed.
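
A sketch of how an import body could be assembled from the attributes above. The nesting is an assumption: the attribute table does not make it certain that columns sits inside data alongside label and public, so verify this against a real response. build_import is a hypothetical helper and the spreadsheet URL is a placeholder.

```python
COLUMN_TYPES = {"StringType", "IntegerType", "DecimalType", "DateType", "DateYearType"}


def build_import(url, label, public, columns=None):
    """Build the JSON-serialisable body for a new import.

    columns maps a 0-based column index to one of COLUMN_TYPES.
    ASSUMPTION: "columns" is nested inside "data".
    """
    payload = {
        "files": [{"type": "external", "key": url}],
        "data": {"label": label, "public": public},
    }
    if columns:
        for type_name in columns.values():
            if type_name not in COLUMN_TYPES:
                raise ValueError(f"Unknown column type: {type_name}")
        # Column indexes are sent as strings.
        payload["data"]["columns"] = {str(index): name for index, name in columns.items()}
    return payload


payload = build_import(
    "https://example.com/vehicles.csv", "Vehicles", False,
    columns={0: "StringType", 1: "DecimalType"},
)
```

Leaving columns out keeps the default behaviour, where each column's type is auto-detected.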

Create an Import

Request

The dataset field types can also be set (otherwise they will be guessed):

Response
{
   "status": 303,
   "href": "/api/imports/252b529188c2bdee4c16570a133a329d"
}

Get an Import

By making a GET request to the URL returned in the previous call we can get the status and all the other info related to our import.
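
Because the import runs in the background, a client will usually poll this URL until the status settles. The sketch below assumes that completed and failed are the only terminal statuses (the others being pending, ready and running); wait_for_import is a hypothetical helper, and fetch_status stands for any callable that performs the GET and returns the status string.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_import(fetch_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status.

    Raises TimeoutError if the import is still in progress after
    max_polls attempts.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError("import did not finish in time")
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays.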

Request
curl -u'dataseed@example.com:password' -H'Accept: application/json' -H'Content-type: application/json' https://dataseedapp.com/api/imports/252b529188c2bdee4c16570a133a329d
Response

Dataset

The Dataset Object

id (string) The ID of the dataset, used in the dataset's URL.
label (string) The label of the dataset, displayed on the datasets page.
public (boolean) The public/private status of the dataset.
fields (array) id (string) The ID of the field, used to create/update dimensions and observations.
label (string) The label of the field, used as an axis label on charts and in the visualisation/element customisation interface.
type (string) The data type of the field, must be one of string, date, integer, float or geo.
imports (array) Any imports associated with the dataset, see the import object.
visualisations (array) See the visualisation object.
cut (object) Used to set a default cut (filter) on this dataset's visualisations. The key is a field ID and the value is the value to cut on.
created (string) The date and time the dataset was created
modified (string) The date and time the dataset was last updated
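
For manual dataset creation, a body with one field per "column" could be put together as in the sketch below. Only attributes from the table above are used. build_dataset and the example field names are hypothetical, and whether the API expects id to be supplied on creation is an assumption.

```python
FIELD_TYPES = {"string", "date", "integer", "float", "geo"}


def build_dataset(dataset_id, label, public, fields):
    """Build a dataset body from (field_id, field_label, field_type) tuples."""
    body = {"id": dataset_id, "label": label, "public": public, "fields": []}
    for field_id, field_label, field_type in fields:
        if field_type not in FIELD_TYPES:
            raise ValueError(f"Unknown field type: {field_type}")
        body["fields"].append({"id": field_id, "label": field_label, "type": field_type})
    return body


dataset = build_dataset(
    "my-vehicle-dataset", "My Vehicle Dataset", False,
    [("model", "Model", "string"), ("miles", "Miles driven", "float")],
)
```

Here "model" would typically serve as a dimension and "miles" as a measure.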

Get a Dataset

Request
curl -H'Accept: application/json' https://dataseedapp.com/api/datasets/mortality
Response

Create a Dataset

Request
Response
{
    "status": 303,
    "href": "/api/datasets/my-vehicle-dataset"
}

Update a Dataset

By making a PUT request to the URL returned in the previous call we can make changes to our new dataset.

Request
curl -XPUT -u'dataseed@example.com:password' -H'Accept: application/json' -H'Content-type: application/json' https://dataseedapp.com/api/datasets/my-vehicle-dataset -d'
{
    "label": "My Awesome Vehicle Dataset",
    "public": false
}'
Response
{
    "status": 303,
    "href": "/api/datasets/my-vehicle-dataset"
}

Delete a Dataset

Request
curl -XDELETE -u'dataseed@example.com:password' -H'Accept: application/json' https://dataseedapp.com/api/datasets/my-vehicle-dataset
Response
{
    "status": 200,
    "message": "Deleted"
}

List all Datasets

The datasets list is paginated and allows filtering to specify the number and type of datasets to return. The parameters should be sent in the query string of the GET request.

limit The number of datasets to return, by default this is 10.
offset The position from which to return datasets, used for pagination. For example, to return datasets 15-25 use an offset of 15.
public Set to true to retrieve only public datasets.
private Set to true to retrieve only private datasets.
shared Set to true to retrieve only datasets that have been shared with you.
Request
curl -u'dataseed@example.com:password' -H'Accept: application/json' https://dataseedapp.com/api/datasets/
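
The query string for a paginated or filtered listing can be generated as in this sketch. It assumes the filter flags are passed as the literal string "true", following the parameter descriptions above; list_datasets_url is a hypothetical helper.

```python
from urllib.parse import urlencode

API_ROOT = "https://dataseedapp.com/api"
VISIBILITY_FILTERS = {"public", "private", "shared"}


def list_datasets_url(limit=10, offset=0, visibility=None):
    """Return the URL for one page of the datasets list."""
    params = {"limit": limit, "offset": offset}
    if visibility is not None:
        if visibility not in VISIBILITY_FILTERS:
            raise ValueError(f"Unknown filter: {visibility}")
        params[visibility] = "true"
    return f"{API_ROOT}/datasets/?{urlencode(params)}"


second_page = list_datasets_url(limit=10, offset=10)
```

Successive pages are fetched by increasing offset by limit each time.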
Response

Visualisation

The Visualisation Object

id (string) The ID of the visualisation, used in the visualisation's URL after the dataset ID (e.g. /visualise/mortality/1).
label (string) The title of the visualisation, displayed at the top of the visualisation page.
description (string) The sub-title of the visualisation, displayed below the title on the visualisation page.
elements (array) See the element object.
styles (array) See the style object.

The Element Object

id (string) The ID of the element
type (string) The type of element, one of summary, navigation, bar, bubble, geo, table or line.
label (string) The title of the element, usually displayed at the top of a chart (dependent on the element's type).
measure_label (string) The label for the element's measure. For example, on line charts this will be the y-axis label.
aggregation (string) The aggregation to perform on the element's measure, one of sum, mean, min, max or rows. The special rows aggregation is used when the element's measure is null.
measure (object) id (string) The ID of the dataset field to use as the element's measure
dimensions (array) field (object) An object containing the ID of the dataset field to use as a dimension
bucket_interval (string) Used to bucket (also known as binning) the dimension. Allowed values are date_year, date_quarter, date_month, date_week, date_day, date_hour, date_minute, date_second or custom.
bucket (integer) If bucket_interval is custom and the dimension is an integer or float, the dimension will be bucketed by this value.
weight (integer) The weight/order of this dimension in the element. This is only relevant in multi-dimensional elements such as filters.
interactive (boolean) By default clicking on a chart feature (such as a bar or a point on a line) will filter the visualisation. If interactive is set to false clicking on the chart will have no effect.
required (boolean) Only used by the navigation element type. When set to true this will stop users from unselecting a default dataset cut, effectively making the default cut required.
sort (object) Only used by the navigation element type. Specifies the order in which the filter options are displayed in the format {"[attribute]": "[direction]"}. The attribute can be total or label and the direction can be asc or desc.
width (integer) An integer from 1 to 4 representing the width of the rendered element. Dataseed is responsive so a value of "1" will use 25% of the current screen width. On mobile, this value is ignored as elements are always full width.
height (integer) An integer from 1 to 255 representing the height of the rendered element. A value of "1" is equivalent to 50 pixels and the same height will be used across all devices.
x (integer) An integer specifying the element's X position within the visualisation.
y (integer) An integer specifying the element's Y position within the visualisation.
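
The width and height units can be converted to rendered sizes with simple arithmetic. The sketch below assumes that width scales linearly (so "4" fills the full screen width), which is inferred from "1" being 25% rather than stated outright; element_size is a hypothetical helper and does not account for the full-width behaviour on mobile.

```python
def element_size(width, height):
    """Return (width as % of screen width, height in pixels)."""
    if not 1 <= width <= 4:
        raise ValueError("width must be between 1 and 4")
    if not 1 <= height <= 255:
        raise ValueError("height must be between 1 and 255")
    # One width unit is 25% of the screen; one height unit is 50 pixels.
    return width * 25, height * 50


half_width_chart = element_size(2, 6)
```

Under these assumptions a chart with width 2 and height 6 takes half the screen width and is 300 pixels tall.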

Elements

Summary

Displays a summary text of the current dataset cut.

Navigation

The navigation element displays all the dimensions' values as links.

Bar / Bubble / Line

A generic chart visualisation element can be defined by:

Table

A simple table of dimension and measure values:

Geo

A choropleth map:

Example

Styles

The Style Object

id (string) The ID of the style, see style types for a full list.
value (string) The value of the style, e.g. #f00.

Style types

ID Default Description
visualisationBackground #fff Background colour for the visualisation
background #f5f6f9 Background colour for the charts
heading #566573 Chart heading text colour
featureFill #72bd49 Chart feature (e.g. bars or lines) colour
featureFillActive #88939d Colour for inactive features when a chart is cut (i.e. features without a cut)
featureStroke #fff Chart feature outline colour
featureStrokeActive #fff Chart inactive feature outline colour
label #2c3e50 Chart label text colour
scaleFeature #2c3e50 Scale/axis line colour
scaleLabel #2c3e50 Scale label text colour
measureLabel #2c3e50 Measure label text colour
choroplethMin #fff Map feature (e.g. a country or a state) start colour
choroplethMax #000 Map feature end colour
choroplethStroke #000 Map feature outline colour
choroplethStrokeWidth 1 Map feature outline width
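
A styles array for a visualisation can be generated from the IDs in the table above. The sketch below rejects unknown IDs and emits one style object per override; build_styles is a hypothetical helper, and values are passed through as strings, matching the Style Object description.

```python
STYLE_IDS = {
    "visualisationBackground", "background", "heading", "featureFill",
    "featureFillActive", "featureStroke", "featureStrokeActive", "label",
    "scaleFeature", "scaleLabel", "measureLabel", "choroplethMin",
    "choroplethMax", "choroplethStroke", "choroplethStrokeWidth",
}


def build_styles(overrides):
    """Turn {style_id: value} into the array of style objects."""
    styles = []
    for style_id, value in overrides.items():
        if style_id not in STYLE_IDS:
            raise ValueError(f"Unknown style ID: {style_id}")
        styles.append({"id": style_id, "value": str(value)})
    return styles


styles = build_styles({"heading": "#f00", "choroplethStrokeWidth": 2})
```

Styles that are not overridden keep the defaults listed in the table.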

Dimensions

Dimensions allow us to store additional metadata for dimension field values (dimension members). For example, your dataset may contain a field "gender" that has values "M" and "F". We can use the Dimension model to associate the labels "Male" and "Female", which will then be used by charts in the visualisation automatically.

The Dimension Object

id (string) The ID of the dimension value.
label (string) The label of the dimension value.

Get Dimension Values

Request
curl -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/dimensions/gender'
Response

Create Dimension Values

Creating (POSTing) dimension values will delete any existing dimension values. To append dimension values use PUT.

Request
Response
{
    "status": 200,
    "message": "Created"
}

Update Dimension Values

Request
Response
{
    "status": 200,
    "message": "Updated"
}

Delete Dimension Values

Request
curl -XDELETE -u'dataseed@example.com:password' -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/dimensions/gender'
Response
{
    "status": 200,
    "message": "Deleted"
}

Observations

The Observation Object

id (integer) The ID of the observation.
<field-id> (various) The value of the specified <field-id> for this observation. The type is dependent on the type of the field.

Get Observations

Request
curl -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/observations/'
Response

Get Aggregated Observations

When requesting observations for the purposes of visualisation they are returned aggregated on a particular measure for a particular dimension. The dimension is specified in the URL and the measure with the following GET parameters:

  • measure - The field that contains the measure. This parameter is required unless the aggregation is specified as "rows".
  • aggregation - The aggregation type ("sum", "mean", "min", "max" or "rows"). This parameter is always required.

In the example below we are requesting a "sum" aggregation of the "value" measure for the "gender" dimension.

Request
curl -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/observations/gender?aggregation=sum&measure=value'
Response

Get Aggregated (Unfaceted) Observations

To get "unfaceted" aggregations, that is, aggregations that are not related to a dimension's facets, we just have to omit the dimension name from the observations Read URL. For example, if we want to get a sum of the "value" measure across all the dataset's observations and dimensions:

Request
curl -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/observations/?aggregation=sum&measure=value'
Response
{"total": 5057611.0}
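
Both the faceted and unfaceted aggregation URLs follow the same pattern, so one helper can build either. This is a sketch that enforces the parameter rules described above (aggregation always required, measure required unless the aggregation is "rows"); observations_url is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://dataseedapp.com/api"
AGGREGATIONS = {"sum", "mean", "min", "max", "rows"}


def observations_url(dataset_id, aggregation, measure=None, dimension=None):
    """Return the URL for aggregated observations.

    Omit dimension for an unfaceted aggregation across the whole dataset.
    """
    if aggregation not in AGGREGATIONS:
        raise ValueError(f"Unknown aggregation: {aggregation}")
    if aggregation != "rows" and measure is None:
        raise ValueError("measure is required unless aggregation is 'rows'")
    params = {"aggregation": aggregation}
    if measure is not None:
        params["measure"] = measure
    facet = quote(dimension) if dimension else ""
    return f"{API_ROOT}/datasets/{quote(dataset_id)}/observations/{facet}?{urlencode(params)}"


faceted = observations_url("mortality", "sum", measure="value", dimension="gender")
unfaceted = observations_url("mortality", "sum", measure="value")
```

The two results match the URLs used in the curl requests in this section.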

Create Observations

Creating (POSTing) observations will delete all existing entries. To append observations use PUT.

Request
Response
{
    "status": 200,
    "message": "Created"
}

Update Observations

To replace existing observations use the same id value. If no id is passed or if the id doesn't exist in the dataset the observation will be added instead.

Request
Response
{
    "status": 200,
    "message": "Updated"
}

Delete Observations

The request to delete all observations in a dataset is shown below. To delete an individual observation within a dataset make the same request but with the observation's id value appended to the URL (e.g. /api/datasets/mortality/observations/1234).

Request
curl -XDELETE -u'dataseed@example.com:password' -H'Accept: application/json' 'https://dataseedapp.com/api/datasets/mortality/observations'
Response
{
    "status": 200,
    "message": "Deleted"
}
