Datomic Cloud provides indelibility and auditability, a flexible data model, ACID transactions, a powerful query language, and horizontal read scaling. Datomic’s close integration with AWS enables you to offload the administrative burdens of operating and scaling, so you don’t have to worry about hardware provisioning, setup and configuration, and capacity planning.
Datomic is designed as a general-purpose, fully ACID transactional system of record. It is a good fit for systems that store valuable information of record, require developer and operational flexibility, need history and audit capabilities, and require read scalability. Some examples of successful Datomic use cases include transactional data, business records, medical records, financial records, scientific records, inventory, configuration, web applications, departmental databases, and cloud applications.
Datomic is not a good fit for applications that need unlimited write scalability, nor is it suited to storing large, unstructured data (BLOBs, media files, etc.).
Datomic is consistent in both the CAP and ACID sense. Datomic transactions are always ACID. All transactions are serialized and all reads are guaranteed to be consistent.
Datomic Cloud currently provides a Clojure Client library. More information about Datomic clients can be found here.
Datomic runs on the Amazon Elastic Compute Cloud (EC2) and uses DynamoDB, Elastic File System (EFS), Simple Storage Service (S3), and several other AWS services.
Datomic Cloud is available on the Amazon Marketplace as an AMI-based product that can be quickly and easily deployed into your AWS account. Datomic Cloud is billed on a per-usage basis and must be run in AWS. Datomic is also available as Datomic On-Prem, which can run in any environment (cloud or on-premises).
Datomic Cloud and Datomic On-Prem can both be used with the same Datomic Client API. Applications that use this API are compatible with either Datomic Cloud or Datomic On-Prem. However, Datomic Cloud does not support the use of the Datomic Peer Library. Additionally, Datomic does not provide a seamless process for moving databases between Datomic Cloud and Datomic On-Prem. See the Moving To Cloud guide for more information on the similarities and differences between Datomic Cloud and Datomic On-Prem.
We intend to publish the specification for the Client wire protocol once it has been finalized. In an effort to prevent future breaking changes to the protocol, we are gathering information and user experience details prior to finalizing the specification.
Datomic represents transactions as data structures. This is a significant difference from SQL databases, where requests are submitted as strings. Using data instead of strings makes it easier to build requests programmatically.
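As a rough illustration of why data is easier to build programmatically than strings, here is a minimal Python sketch. It is not the Datomic API: the attribute names (`:inventory/sku`, `:inventory/count`) and the helper `make_tx_data` are hypothetical, and Python dicts merely stand in for the maps a client would submit.

```python
def make_tx_data(items):
    """Build a transaction as a plain list of maps, one per entity.

    Hypothetical helper: each map pairs attribute names with values.
    """
    return [
        {":inventory/sku": sku, ":inventory/count": count}
        for sku, count in items
    ]


tx_data = make_tx_data([("SKU-1", 10), ("SKU-2", 0)])

# Because the request is ordinary data, it can be filtered or extended
# with normal collection operations; no string escaping or parsing is needed.
in_stock = [m for m in tx_data if m[":inventory/count"] > 0]
```

The same request built as a string would need quoting and escaping rules for every value, which is where programmatic construction usually becomes error-prone.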
The Processing Transactions section of the Datomic Documentation provides details on the syntax and usage of Datomic Transactions.
Datomic uses a simple, declarative, logic-based query language called Datalog. Like SQL, Datalog is a declarative query language, meaning you specify what you want to know, but not how to find it. A Datalog system includes a database of facts (your Datomic database) and a set of rules.
The Datomic Query Engine takes a partially specified set of facts or rules and finds all instances of the specification implied by the database and rules. The Executing Queries page describes the fundamentals of building a Datomic Query.
Datalog is a deductive query system combining a database of facts (the Datomic db) with a set of rules for deriving new facts from existing facts and other rules. This query capability is combined with a powerful hierarchical selection facility, so you can recover tree-like data without joins or complex re-assembly. Datalog with negation is of equivalent power to relational algebra with recursion. Datalog is a great fit for application queries thanks to:
A pattern-matching-like structure, in which joins are implicit
Recursion is much more straightforward than in SQL
Datalog rules subsume SQL views, but have more of a logic feel, allowing a closer alignment to business rules
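To make the point about rules and recursion concrete, the following Python sketch evaluates a recursive rule by naive fixpoint iteration. It is a toy model of how a deductive system derives new facts from existing ones, not Datomic's query engine or syntax; the `parent` facts and the `ancestors` function are hypothetical.

```python
def ancestors(parent_facts):
    """Derive ancestor pairs from parent facts by naive fixpoint iteration.

    Rules being evaluated (illustrative notation, not Datomic syntax):
      ancestor(x, y) :- parent(x, y)
      ancestor(x, z) :- parent(x, y), ancestor(y, z)
    """
    derived = set(parent_facts)
    while True:
        # Join parent(x, y) with ancestor(y, z) on the shared variable y.
        new = {
            (x, z)
            for (x, y) in parent_facts
            for (y2, z) in derived
            if y == y2
        } - derived
        if not new:
            return derived
        derived |= new


parents = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
result = ancestors(parents)
```

The join on the shared variable is implicit in the rule itself, and the recursion is simply the rule referring to its own name.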
The mbrainz-importer library provides an example approach to importing a dataset into Datomic.
The Examples section of the Datomic Documentation contains links to several example projects that use Datomic.
Datomic is built upon the model of data consisting of immutable values. While many queries might be interested in the 'current' facts, others might be interested in, e.g. what the product catalog looked like last month compared to this month. Incorporating time in data allows the past to be retained (or not), and supports point-in-time queries. Many real world systems have to retain all changes, and struggle mightily to efficiently provide the 'latest' view in a traditional database. This all happens automatically in Datomic. Datomic is a database of facts, not places.
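A point-in-time query can be sketched over an append-only log of facts. The Python below is a toy model under stated assumptions, not Datomic's implementation: the `Fact` type and `as_of` function are hypothetical, transaction ids are assumed to increase over time, and every attribute is treated as single-valued.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    entity: int
    attribute: str
    value: object
    tx: int       # transaction id; larger means later (assumption)
    added: bool   # True = assertion, False = retraction


def as_of(log, tx):
    """Return the {(entity, attribute): value} view as of transaction `tx`.

    Assumes single-valued attributes; nothing in `log` is ever mutated.
    """
    view = {}
    for f in sorted(log, key=lambda f: f.tx):
        if f.tx > tx:
            break
        key = (f.entity, f.attribute)
        if f.added:
            view[key] = f.value
        elif view.get(key) == f.value:
            del view[key]
    return view


log = [
    Fact(1, ":product/price", 100, 1, True),
    Fact(1, ":product/price", 100, 2, False),  # old price retracted
    Fact(1, ":product/price", 120, 2, True),   # new price asserted
]

last_month = as_of(log, 1)
this_month = as_of(log, 2)
```

Because the log only grows, the earlier view stays recoverable after the price changes.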
A Datomic database is just a set of datoms (see below), indexed in various ways. These indexes contain all of the data, not pointers to data (i.e. they are covering indexes). The storage service and caches are just a distribution network for the data segments of these indexes, all of which are immutable, and thus effectively and coherently cached.
The flexibility provided by Datomic’s simple data model empowers modeling many different data types naturally. The Universal Relation handles row-shaped, column-shaped, graph-like, and document-like data modeling equally well. Applications written to this model are free of the structural rigidity of relational and document models.
A datom is an atomic fact that represents the addition or retraction of a relation between an entity, an attribute, a value, and a transaction. A datom is expressed as a five-tuple:
an entity id
an attribute
a value for the attribute
a transaction id
a boolean indicating whether the datom is being added or retracted
No. Datomic’s Universal Relation (any entity may have any set of attributes) removes the requirement of tables. Queries can interrogate any set of entities in the database without having to resort to join tables or other mapping strategies.
Datomic supports a variety of scalar and collection datatypes. The Programming with Data and EDN section of the Datomic Documentation provides details on all supported data types.
Datomic Schema is defined using the same data model used for application data. That is, attributes are themselves entities with associated attributes. The Defining Schema Attributes page of the Datomic Documentation provides further details and examples.
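As a rough sketch of what "schema as data" means, the attribute definitions below are written as plain Python dicts. The keys `:db/ident`, `:db/valueType`, and `:db/cardinality` are Datomic schema attributes; the `:product/...` names are hypothetical, and Python strings stand in for keywords.

```python
# Attribute definitions are ordinary maps, the same shape as application data.
schema = [
    {":db/ident": ":product/name",
     ":db/valueType": ":db.type/string",
     ":db/cardinality": ":db.cardinality/one"},
    {":db/ident": ":product/tags",
     ":db/valueType": ":db.type/string",
     ":db/cardinality": ":db.cardinality/many"},
]

# Since schema is data, it can be inspected with the same tools as any data.
many_valued = [a[":db/ident"] for a in schema
               if a[":db/cardinality"] == ":db.cardinality/many"]
```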
No. All databases have a schema, whether it is explicit (e.g., traditional relational databases) or inferred (e.g., so-called schema-less databases). Rigidity arises in systems to the extent that the schema pervades the storage representation or application access patterns, making changes to your tables or documents difficult. The schema required to encode datoms is extremely minimal, consisting primarily of the attribute definitions, which specify name, type, cardinality, etc. The advantage of an explicit flexible schema definition is that it provides power to your system:
Power to efficiently access the data (via query)
Power to directly model the domain structure of your data
Power to reason about the representation of the data in the database
Datomic’s flexible universal schema and multiple spanning indexes support modeling numerous data configurations, including row-oriented, column-oriented, document-oriented, K/V, and graph-like. The Data Modeling page of the Datomic Documentation provides additional details on data modeling in Datomic.
Unlike traditional RDBMSs, the Datomic Architecture separates storage, transactions, and query. Datomic’s data model is based upon a universal primitive relation of datoms, rather than an arbitrary set of named relations. The query model, while relational, is based upon Datalog, not SQL. Datomic has weaker constraint capabilities than some RDBMSs, but more flexible data modeling. Like them, it provides ACID transactions and arbitrary joins.
Most NoSQL databases sacrifice transactionality and join capability and adopt sharding and eventual consistency. They also often have data models not supported by relational logic. Datomic trades off arbitrary write scalability to retain arbitrary transactions and joins. It provides a strong data model and powerful query with arbitrary read and query scaling.
Datomic automatically maintains a set of multiple covering indexes. These indexes each contain all of the datoms in the database, sorted in different orders, to provide efficient access to the data via multiple access patterns, including row-oriented, column-oriented, document-oriented, K/V, and graph. More information about Datomic’s indexes can be found in the Indexes Documentation.
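The idea of several covering indexes over the same datoms can be sketched in Python as the same list sorted in different orders. This is a toy model only: the sample datoms are hypothetical, the `index` helper is not part of any Datomic API, and a real index would be searched rather than scanned.

```python
from operator import itemgetter

# Toy datoms as (entity, attribute, value, tx) tuples.
datoms = [
    (2, ":person/name", "Bob", 100),
    (1, ":person/name", "Alice", 100),
    (1, ":person/age", 30, 101),
    (2, ":person/age", 25, 101),
]

E, A, V, T = 0, 1, 2, 3


def index(datoms, order):
    """Return every datom, sorted by the given component order."""
    return sorted(datoms, key=itemgetter(*order))


# Each index holds all of the datoms; only the sort order differs.
eavt = index(datoms, (E, A, V, T))  # row-like: all facts about an entity
aevt = index(datoms, (A, E, V, T))  # column-like: all values of an attribute
avet = index(datoms, (A, V, E, T))  # lookup by attribute and value

entity_1 = [d for d in eavt if d[E] == 1]
ages_by_value = [d[V] for d in avet if d[A] == ":person/age"]
```

Facts about one entity sit next to each other in the first ordering, while values of one attribute sit next to each other, in value order, in the third.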
You don’t need to specify a target index when using the Datomic query engine, as it automatically uses the most efficient index for a particular query. If you’re using the Datoms API, you must specify an index explicitly.
Datomic uses an efficient data representation, called fressian, which is further compressed prior to storage. Given the substantial decreases in storage costs over the past decade, the Datomic design chooses to use slightly more storage space to provide a substantially more performant system. Further, Datomic does not pay any penalty for sparse data - the absence of a given attribute on an entity does not require a 'null' value.
No. The built-in covering indexes are both automatic and required for efficient query independent of the access pattern.
Yes, Datomic includes a built-in transaction function for compare-and-swap (cas). Details about it can be found in the :db.fn/cas documentation.
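The semantics of compare-and-swap can be sketched in a few lines of Python. This is a single-threaded toy model over a dict, not the built-in transaction function itself: the `cas` function, the `CasFailed` exception, and the `:account/balance` attribute are hypothetical, and a missing value is represented here as `None`.

```python
class CasFailed(Exception):
    """Raised when the current value does not match the expected one."""


def cas(state, entity, attribute, expected, new):
    """Set (entity, attribute) to `new` only if it currently equals `expected`."""
    key = (entity, attribute)
    current = state.get(key)
    if current != expected:
        raise CasFailed(f"expected {expected!r}, found {current!r}")
    state[key] = new


state = {(1, ":account/balance"): 100}

cas(state, 1, ":account/balance", 100, 80)      # matches, so the write applies

try:
    cas(state, 1, ":account/balance", 100, 60)  # stale expectation
    stale_applied = True
except CasFailed:
    stale_applied = False
```

The second call fails because it was computed against a value that is no longer current, which is the situation compare-and-swap exists to detect.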
No. Unlike with sharding, Datomic data can be read and queried from any node. However, you can route requests so that groups develop hot caches for different workloads. This provides horizontal scaling without the compromises typically associated with sharding. In particular:
You make no up-front decisions about where data will live.
Your programming model is not polluted by any awareness of where data lives.
At any time (and long after transacting your data), you can start new groups and route queries to them.
Datomic Cloud handles scaling automatically. Datomic’s Scaling Topology will automatically add and remove EC2 instances from a Datomic cluster based on CPU utilization, within limits you control.
Moreover, you can start additional groups serving the same system, and route different tasks to different groups. For example, you might have separate groups for transactional load, analytics queries, and support.
The Scaling Topology distributes load across multiple compute instances. In the event of an instance failure, an Application Load Balancer (ALB) will remove the instance, distributing work to the remaining instances while a new instance spins up. Databases remain available throughout an instance failover, with incrementally degraded performance.
Datomic has no single points of failure, either of machines or of disks. Datomic uses AWS storage services that provide the strongest guarantees of throughput and fault tolerance:
Datomic stores the database log in DynamoDB
Datomic stores indexes in S3
While there is no explicit limit, database performance will degrade beyond approximately 10 billion total datoms. Larger data sets can be split across multiple databases.
There are no limits for reads. You can scale reads horizontally by:
autoscaling the number of instances in a query group
explicitly launching additional query groups
Writes to a single database are limited by the serial nature of Datomic transactions. Given the expected size of Datomic databases (see previous), this limit will typically only be encountered during data import.
Datomic Cloud currently runs in:
us-east-1 (N. Virginia)
Some AWS regions are missing one or more service(s) required by Datomic Cloud. If you’re interested in running in another AWS region, please let us know.
Datomic can be hosted in any supported AWS region (see previous). You should run Datomic in the same region as your application.
Datomic automatically stores every datom in multiple AWS services: S3, DynamoDB, and EFS. Each of these services is itself redundant, so you do not need to do anything to ensure data redundancy.
Amazon VPC lets you create a virtual networking environment in a private, isolated section of the AWS cloud, where you can exercise complete control over aspects such as private IP address ranges, subnets, routing tables, and network gateways. Datomic Cloud always runs in a VPC, so you can keep your backend isolated from the public internet.
All Datomic data is encrypted at rest, using keys managed by AWS’s Key Management Service (KMS).
You can control read and write access at the database level using IAM policies.
Datomic uses AWS best practices for seamlessly storing and securing data. Data is stored in highly reliable services: DynamoDB and S3. Data is always encrypted at rest using AES-256, and keys are managed via the AWS Key Management Service (KMS).
Datomic is always accessed through an Application Load Balancer (ALB) that exposes only HTTPS.
Datomic Cloud pricing is entirely usage-based. The price per hour depends on the topology/instance type and is listed on the AWS Marketplace product page.
The charges incurred for running Datomic Cloud will be included in your monthly AWS bill.
The troubleshooting guide covers common problems accessing Datomic.
We prefer not to discuss new features prior to their public release.