Datomic Cloud FAQ

What is Datomic?

What kind of database is Datomic?

Datomic Cloud provides indelibility and auditability, a flexible data model, ACID transactions, a powerful query language, and horizontal read scaling. Datomic’s close integration with AWS enables you to offload the administrative burdens of operating and scaling, so you don’t have to worry about hardware provisioning, setup and configuration, and capacity planning.

What sorts of applications is Datomic designed for?

Datomic is designed as a general-purpose, fully ACID, transactional system of record. It is a good fit for systems that store valuable information of record, require developer and operational flexibility, need history and audit capabilities, and require read scalability. Some examples of successful Datomic use cases include transactional data, business records, medical records, financial records, scientific records, inventory, configuration, web applications, departmental databases, and cloud applications.

What sort of applications is Datomic not a good fit for?

Datomic is not a good fit for applications that need unlimited write scalability, or as storage for large, unstructured data (BLOBs, media files, etc.).

What is Datomic’s consistency model and does Datomic support ACID transactions?

Datomic is consistent in both the CAP and ACID sense. Datomic transactions are always ACID. All transactions are serialized and all reads are guaranteed to be consistent.

What languages does Datomic Cloud support?

Datomic Cloud currently provides a Clojure Client library. More information about Datomic clients can be found here.

Can Datomic be used by applications running on any OS?

Yes. The only requirements are an application that uses one of the Datomic client libraries and HTTP access to the Datomic system. Step-by-step instructions are available in the Getting Started section of the Datomic Documentation.

What software & services does Datomic run on?

Datomic runs on the Amazon Elastic Compute Cloud (EC2) and uses DynamoDB, Elastic File System (EFS), Simple Storage Service (S3), and several other AWS services.

What is the relationship between Cognitect and Datomic?

Cognitect is the company behind the Datomic product. Cognitect has almost 15 years of experience building software for itself and for its clients. Cognitect can help you design, build, launch, and support your Datomic-based systems. Get in touch.

What are the different versions? Which one is right for me?

Datomic Cloud is available on the AWS Marketplace as an AMI-based product that can be quickly and easily deployed into your AWS account. Datomic Cloud is billed on a per-usage basis and must be run in AWS. Datomic is also available as Datomic On-Prem, which can run in any environment (cloud or on-premises).

If I build a system on version X, can I port it to version Y?

Datomic Cloud and Datomic On-Prem can both be used with the same Datomic Client API. Applications that use this API are compatible with either Datomic Cloud or Datomic On-Prem. However, Datomic Cloud does not support the use of the Datomic Peer Library. Additionally, Datomic does not provide a seamless process for moving databases between Datomic Cloud and Datomic On-Prem. See the Moving To Cloud guide for more information on the similarities and differences between Datomic Cloud and Datomic On-Prem.

Will I be locked in to Datomic?

No. The data in Datomic can be exported via datalog queries or direct index access to a file and format of your choice.

Will you publish the client protocol so I can write my own Datomic Client?

We intend to publish the specification for the Client wire protocol once it has been finalized. In an effort to prevent future breaking changes to the protocol, we are gathering information and user experience details prior to finalizing the specification.

Getting Started

How do I get started using Datomic?

The Getting Started section of the Datomic Documentation provides a step-by-step guide to starting Datomic in your AWS account and accessing it via the Client library.

How do I perform transactions with Datomic?

Datomic represents transactions as data structures. This is a significant difference from SQL databases, where requests are submitted as strings. Using data instead of strings makes it easier to build requests programmatically.

The Processing Transactions section of the Datomic Documentation provides details on the syntax and usage of Datomic Transactions.
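
As a rough illustration of why data is easier to build programmatically than strings, the sketch below assembles a transaction-shaped value from application data. It is a toy model in Python, not the Datomic Client API: the function `new_user_tx` and the attribute names `:user/name` and `:user/email` are hypothetical, and a real client would submit Clojure data rather than Python lists and dicts.

```python
# Toy illustration only: plain Python data standing in for a transaction.
# The attribute names (":user/name", ":user/email") are hypothetical.

def new_user_tx(users):
    """Build one transaction (a list of entity maps) from (name, email) pairs."""
    return [
        {":user/name": name, ":user/email": email}
        for name, email in users
    ]

tx_data = new_user_tx([("Ada", "ada@example.com"), ("Alan", "alan@example.com")])

# Because the request is ordinary data, it can be inspected or extended
# with normal collection operations before it is submitted.
tx_data.append({":user/name": "Grace", ":user/email": "grace@example.com"})
```

No string concatenation or escaping is involved at any point, which is the property the answer above describes.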

How do I query Datomic?

Datomic uses a simple, logic-based query language called Datalog. Like SQL, Datalog is a declarative query language, meaning you specify what you want to know, but not how to find it. A Datalog system includes a database of facts (your Datomic database) and a set of rules.

The Datomic Query Engine takes a partially specified set of facts or rules and finds all instances of the specification implied by the database and rules. The Executing Queries page describes the fundamentals of building a Datomic Query.

How does Datalog differ from SQL?

Datalog is a deductive query system combining a database of facts (the Datomic db) with a set of rules for deriving new facts from existing facts and other rules. This query capability is combined with a powerful hierarchical selection facility, so you can recover tree-like data without joins or complex re-assembly. Datalog with negation is of equivalent power to relational algebra with recursion. Datalog is a great fit for application queries thanks to:

  • A pattern-matching-like structure, in which joins are implicit

  • Recursion that is much more straightforward than in SQL

  • Rules that subsume SQL views but have more of a logic feel, allowing a closer alignment to business rules
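
To make the idea of implicit joins concrete, here is a minimal pattern matcher, written in Python as a toy model. It is not Datomic's query engine and does not use Datomic's query syntax; the facts, the attribute names, and the helper functions `is_var`, `match`, and `query` are all assumptions made for illustration. The point it shows is that reusing a variable (`?c`) in two patterns is enough to join them.

```python
# Toy pattern matcher, not Datomic's query engine.
# Facts are (entity, attribute, value) triples; strings starting with "?" are variables.

facts = [
    (1, "person/name", "Ada"),
    (1, "person/works-for", 10),
    (2, "person/name", "Alan"),
    (2, "person/works-for", 11),
    (10, "company/name", "Analytical Engines"),
    (11, "company/name", "Bletchley"),
]

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, fact):
        if is_var(term):
            if term in result and result[term] != value:
                return None
            result[term] = value
        elif term != value:
            return None
    return result

def query(patterns, facts):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for fact in facts
            if (extended := match(pattern, fact, binding)) is not None
        ]
    return bindings

# "?c" appears in two patterns, which joins people to companies implicitly.
where = [
    ("?p", "person/name", "?name"),
    ("?p", "person/works-for", "?c"),
    ("?c", "company/name", "?company"),
]
results = sorted((b["?name"], b["?company"]) for b in query(where, facts))
```

A real engine would pick indexes and orderings rather than scanning every fact for every pattern; the nested loop here only demonstrates the semantics.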

How do I import an existing dataset?

The mbrainz-importer library provides an example approach to importing a dataset into Datomic.

How can I get help using Datomic?

Answers to many common questions can be found in the Datomic Documentation. Additionally, the Datomic Developers Forum is the official community discussion and support forum.

Where can I find examples of Datomic usage?

The Examples section of the Datomic Documentation contains links to several example projects that use Datomic.

Do you offer professional services to help with my project?

Cognitect, the company behind Datomic, has almost 15 years of professional services experience in the software industry. More info about Cognitect services can be found here. Feel free to get in touch about your needs for design, review, development or operational help.

Data Model/Schema

What is Datomic’s data model?

Datomic is built upon the model of data consisting of immutable values. While many queries might be interested in the 'current' facts, others might be interested in, e.g. what the product catalog looked like last month compared to this month. Incorporating time in data allows the past to be retained (or not), and supports point-in-time queries. Many real world systems have to retain all changes, and struggle mightily to efficiently provide the 'latest' view in a traditional database. This all happens automatically in Datomic. Datomic is a database of facts, not places.

A Datomic database is just a set of datoms (see below), indexed in various ways. These indexes contain all of the data, not pointers to data (i.e. they are covering indexes). The storage service and caches are just a distribution network for the data segments of these indexes, all of which are immutable, and thus effectively and coherently cached.

The flexibility provided by Datomic’s simple data model empowers modeling many different data types naturally. The Universal Relation handles row-shaped, column-shaped, graph-like, and document-like data modeling equally well. Applications written to this model are free of the structural rigidity of relational and document models.

What is a datom?

A datom is an atomic fact that represents the addition or retraction of a relation between an entity, an attribute, a value, and a transaction. A datom is expressed as a five-tuple:

  • an entity id

  • an attribute

  • a value for the attribute

  • a transaction id

  • a boolean indicating whether the datom is being added or retracted
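
The five-tuple can be modeled in a few lines. The sketch below is a toy Python model, not Datomic's storage format: the field names `e`, `a`, `v`, `tx`, `added`, the sample attribute `item/price`, and the function `facts_as_of` are assumptions for illustration. It shows how a log of additions and retractions can answer both "what is true now" and "what was true as of an earlier transaction".

```python
from collections import namedtuple

# Toy model of the five-tuple; the field names are illustrative.
Datom = namedtuple("Datom", ["e", "a", "v", "tx", "added"])

log = [
    Datom(1, "item/price", 100, tx=1000, added=True),
    Datom(1, "item/price", 100, tx=1001, added=False),  # retract the old price
    Datom(1, "item/price", 120, tx=1001, added=True),   # assert the new price
]

def facts_as_of(datoms, tx):
    """Return the set of (e, a, v) facts that hold after transaction tx."""
    facts = set()
    for d in sorted(datoms, key=lambda d: d.tx):
        if d.tx > tx:
            break
        if d.added:
            facts.add((d.e, d.a, d.v))
        else:
            facts.discard((d.e, d.a, d.v))
    return facts
```

Calling `facts_as_of(log, 1000)` yields the original price and `facts_as_of(log, 1001)` yields the updated one; nothing in the log is overwritten to get either answer.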

Does Datomic have tables?

No. Datomic’s Universal Relation (any entity may have any set of attributes) removes the requirement of tables. Queries can interrogate any set of entities in the database without having to resort to join tables or other mapping strategies.

What data types are supported by Datomic?

Datomic supports a variety of scalar and collection datatypes. The Programming with Data and EDN section of the Datomic Documentation provides details on all supported data types.

How do I create my schema?

Datomic Schema is defined using the same data model used for application data. That is, attributes are themselves entities with associated attributes. The Defining Schema Attributes page of the Datomic Documentation provides further details and examples.
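
As a hedged sketch of what "schema as data" means, the Python below represents two attribute definitions as plain maps and uses them to check values. The keys `:db/ident`, `:db/valueType`, and `:db/cardinality` follow Datomic's schema attribute naming, but the user attributes, the `PYTHON_TYPES` table, and the `conforms` function are hypothetical and only model the idea; they are not how Datomic itself validates data.

```python
# Toy model: attribute definitions as plain data.
schema = [
    {":db/ident": ":user/name",
     ":db/valueType": ":db.type/string",
     ":db/cardinality": ":db.cardinality/one"},
    {":db/ident": ":user/scores",
     ":db/valueType": ":db.type/long",
     ":db/cardinality": ":db.cardinality/many"},
]

# Hypothetical mapping from schema value types to Python types.
PYTHON_TYPES = {":db.type/string": str, ":db.type/long": int}

# Look attributes up by name, the same way application data is looked up.
attributes = {attr[":db/ident"]: attr for attr in schema}

def conforms(ident, value):
    """Check a value against the attribute's declared type and cardinality."""
    attr = attributes[ident]
    expected = PYTHON_TYPES[attr[":db/valueType"]]
    if attr[":db/cardinality"] == ":db.cardinality/many":
        return all(isinstance(v, expected) for v in value)
    return isinstance(value, expected)
```

Because the definitions are ordinary data, they can be generated, queried, and versioned with the same tools as the rest of the application's data.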

Does Datomic support schema-less data?

No. All databases have a schema, whether it is explicit (e.g. traditional relational databases) or inferred (e.g. so-called schema-less databases). Rigidity arises in systems to the extent that the schema pervades the storage representation or application access patterns, making changes to your tables or documents difficult. The schema required to encode datoms is extremely minimal, consisting primarily of the attribute definitions, which specify name, type, cardinality, etc. The advantage of an explicit flexible schema definition is that it provides power to your system:

  • Power to efficiently access the data (via query)

  • Power to directly model the domain structure of your data

  • Power to reason about the representation of the data in the database

How should I model my data in Datomic?

Datomic’s flexible universal schema and multiple spanning indexes support modeling numerous data configurations, including row-oriented, column-oriented, document-oriented, K/V, and graph-like. The Data Modeling page of the Datomic Documentation provides additional details on data modeling in Datomic.

How do I prevent users from adding invalid data?

How is Datomic different from a traditional RDBMS? A NoSQL database?

Unlike traditional RDBMSs, the Datomic Architecture separates storage, transactions, and query. Datomic’s data model is based upon a universal primitive relation of datoms rather than an arbitrary set of named relations. The query model, while relational, is based upon Datalog, not SQL. Datomic has weaker constraint capabilities than some RDBMSs, but more flexible data modeling. Like an RDBMS, it provides ACID transactions and arbitrary joins.

Most NoSQL databases sacrifice transactionality and join capability and adopt sharding and eventual consistency. They also often have data models not supported by relational logic. Datomic trades off arbitrary write scalability to retain arbitrary transactions and joins. It provides a strong data model and powerful query with arbitrary read and query scaling.

What indexes does Datomic use?

Datomic automatically maintains a set of multiple covering indexes. These indexes each contain all of the datoms in the database, sorted in different orders, to provide efficient access to the data via multiple access patterns, including row-oriented, column-oriented, document-oriented, K/V, and graph. More information about Datomic’s indexes can be found in the Indexes Documentation.
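
The idea of "the same datoms, sorted in different orders" can be sketched with ordinary sorted lists. This is a toy Python model, not Datomic's index implementation: the sample data and the `entity` helper are assumptions, and the variable names `eavt`, `aevt`, and `avet` borrow the sort-order names used in Datomic's index documentation purely as labels.

```python
from bisect import bisect_left

# Toy data: (entity, attribute, value, tx) tuples.
datoms = [
    (2, "user/name", "Alan", 1000),
    (1, "user/email", "ada@example.com", 1000),
    (1, "user/name", "Ada", 1000),
    (2, "user/email", "alan@example.com", 1001),
]

# Each "index" holds every datom; only the sort order differs.
eavt = sorted(datoms)                                          # entity first: row-like access
aevt = sorted(datoms, key=lambda d: (d[1], d[0], d[2], d[3]))  # attribute first: column-like access
avet = sorted(datoms, key=lambda d: (d[1], d[2], d[0], d[3]))  # attribute then value: key/value lookup

def entity(eid):
    """All datoms about one entity: a contiguous slice of the entity-first order."""
    start = bisect_left(eavt, (eid,))
    end = bisect_left(eavt, (eid + 1,))
    return eavt[start:end]

# A "column" is a contiguous run of the attribute-first order.
names = [d[2] for d in aevt if d[1] == "user/name"]
```

Since every ordering contains the full datoms, a lookup never has to follow a pointer back to a separate row store, which is what "covering" means here.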

Against which index should I target my query?

You don’t need to specify a target index when using the Datomic query engine, as it automatically uses the most efficient index for a particular query. If you’re using the Datoms API, you must specify an index explicitly.

Don’t all those indexes take up a lot of storage space?

Datomic uses an efficient data representation, called Fressian, which is further compressed prior to storage. Given the substantial decreases in storage costs over the past decade, the Datomic design chooses to use slightly more storage space to provide a substantially more performant system. Further, Datomic does not pay any penalty for sparse data: the absence of a given attribute on an entity does not require a 'null' value.

Can I remove an index?

No. The built-in covering indexes are both automatic and required for efficient query independent of the access pattern.

Does Datomic support conditional operations?

Yes, Datomic includes a built-in transaction function for compare-and-swap (cas). Details about it can be found in the :db.fn/cas documentation.
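
For readers unfamiliar with compare-and-swap, the sketch below models its semantics in Python. It is a toy model and not the `:db.fn/cas` implementation: the `compare_and_swap` function, the `CasConflict` exception, and the dictionary standing in for a database are all assumptions for illustration. The behavior shown is the general one: the write succeeds only if the current value equals the expected value, and otherwise the whole operation fails.

```python
class CasConflict(Exception):
    """Raised when the current value does not match the expected value."""

def compare_and_swap(db, entity, attribute, expected, new):
    """Return a new db with the attribute set to new, or raise on mismatch.

    db is a dict mapping (entity, attribute) to a value; the input is not mutated.
    """
    current = db.get((entity, attribute))
    if current != expected:
        raise CasConflict(
            f"expected {expected!r} for {attribute} on {entity}, found {current!r}"
        )
    updated = dict(db)
    updated[(entity, attribute)] = new
    return updated

db = {(42, "account/balance"): 100}
db_after = compare_and_swap(db, 42, "account/balance", 100, 110)

# A second writer that still believes the balance is 100 is rejected.
try:
    compare_and_swap(db_after, 42, "account/balance", 100, 90)
    conflict = False
except CasConflict:
    conflict = True
```

In a system where transactions are serialized, as described earlier in this FAQ, a check like this is enough to implement optimistic concurrency on a single attribute.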

Availability & Scaling

Does Datomic use sharding?

No. Unlike with sharding, Datomic data can be read and queried from any node. However, you can route requests so that groups develop hot caches for different workloads. This provides horizontal scaling without the compromises typically associated with sharding. In particular:

  • You make no up-front decisions about where data will live.

  • Your programming model is not polluted by any awareness of where data lives.

  • At any time (and long after transacting your data), you can start new groups and route queries to them.

How do I scale the computing resources associated with Datomic Cloud?

Datomic Cloud handles scaling automatically. Datomic’s Scaling Topology will automatically add and remove EC2 instances from a Datomic cluster based on CPU utilization, within limits you control.

Moreover, you can start additional groups serving the same system, and route different tasks to different groups. For example, you might have separate groups for transactional load, analytics queries, and support.

How does Datomic minimize recovery time during failover?

The Scaling Topology distributes load across multiple compute instances. In the event of an instance failure, an Application Load Balancer (ALB) will remove the instance, distributing work to the remaining instances while a new instance spins up. Databases remain available throughout an instance failover, with incrementally degraded performance.

How does Datomic improve fault tolerance to disk failures?

Datomic has no single points of failure, either of machines or of disks. Datomic uses AWS storage services that provide the strongest guarantees of throughput and fault tolerance:

  • Datomic stores the database log in DynamoDB

  • Datomic stores indexes in S3

How big can a database be?

While there is no explicit limit, database performance will degrade beyond approximately 10 billion total datoms. Larger data sets can be split across multiple databases.

What are the scaling limits of Datomic?

There are no limits for reads. You can scale reads horizontally by:

  • autoscaling the number of instances in a query group

  • explicitly launching additional query groups

Writes to a single database are limited by the serial nature of Datomic transactions. Given the expected size of Datomic databases (see previous), this limit will typically only be encountered during data import.

In which AWS regions is Datomic available?

Datomic Cloud currently runs in:

  • us-east-1 (N. Virginia)

  • us-east-2 (Ohio)

  • us-west-2 (Oregon)

  • eu-central-1 (Frankfurt)

  • eu-west-1 (Ireland)

  • ap-southeast-2 (Sydney)

Some AWS regions are missing one or more service(s) required by Datomic Cloud. If you’re interested in running in another AWS region, please let us know.

Can Datomic be hosted in the same region as my application?

Datomic can be hosted in any supported AWS region (see previous). You should run Datomic in the same region as your application.

How do I ensure that my data is stored redundantly?

Datomic automatically stores every datom in multiple AWS services: S3, DynamoDB, and EFS. Each of these services is itself redundant, so you do not need to do anything to ensure data redundancy.

Security and Access Control

What is Amazon Virtual Private Cloud and how does it work with Datomic?

Amazon VPC lets you create a virtual networking environment in a private, isolated section of the AWS cloud, where you can exercise complete control over aspects such as private IP address ranges, subnets, routing tables, and network gateways. Datomic Cloud always runs in a VPC, so you can keep your backend isolated from the public internet.

Can Datomic encrypt sensitive data such as protected health information (PHI) and personally identifiable information (PII)?

All Datomic data is encrypted at rest, using keys managed by AWS’s Key Management Service (KMS).

How do I set up users and permissions?

You can control read and write access at the database level using IAM policies.

How safe is my data?

Datomic uses AWS best practices for seamlessly storing and securing data. Data is stored in highly reliable services: DynamoDB and S3. Data is always encrypted at rest using AES-256, and keys are managed via the AWS Key Management Service (KMS).

Does Datomic Cloud provide SSL/HTTPS access?

Datomic is always accessed through an Application Load Balancer (ALB) that exposes only HTTPS.

Pricing

How much does Datomic cost?

Datomic Cloud pricing is entirely usage-based. The price per hour depends on the topology/instance type and is listed on the AWS Marketplace product page.

When am I billed?

The charges incurred for running Datomic Cloud will be included in your monthly AWS bill.

Troubleshooting

I can’t connect to my Datomic database.

The troubleshooting guide covers common problems accessing Datomic.

Feature Requests

What about feature X?

We are eager to learn about your use of Datomic, and what could make it better. With a Receptive.io account you can propose and vote for feature requests.

When will feature X be available?

We prefer not to discuss new features prior to their public release.

Where can I find out about new features?

You can see release announcements on the Datomic Forum or by following the Datomic Team on Twitter (@datomic_team).